PhD Dissertation

Continuous-Time Modeling Using

Lévy-Driven Moving Averages

Representations, Limit Theorems and Other Properties

Mikkel Slot Nielsen

Department of Mathematics

Aarhus University

2019


PhD dissertation by

Mikkel Slot Nielsen

Department of Mathematics, Aarhus University

Ny Munkegade 118, 8000 Aarhus C, Denmark

Supervised by

Associate Professor Andreas Basse-O’Connor

Associate Professor Jan Pedersen

Submitted to Graduate School of Science and Technology, Aarhus, July 3, 2019

The dissertation was typeset in kpfonts with pdfLaTeX and the memoir class


Contents

Preface
Summary
Resumé

Introduction
1 A Wold–Karhunen type decomposition and the Lévy-driven moving averages
2 Dynamic models for Lévy-driven moving averages
3 Limit theorems for quadratic forms and related quantities of Lévy-driven moving averages
References

Paper A Equivalent martingale measures for Lévy-driven moving averages and related processes
by Andreas Basse-O’Connor, Mikkel Slot Nielsen and Jan Pedersen
1 Introduction and a main result
2 Further main results
3 Preliminaries
4 Proofs
References

Paper B Stochastic delay differential equations and related autoregressive models
by Andreas Basse-O’Connor, Mikkel Slot Nielsen, Jan Pedersen and Victor Rohde
1 Introduction
2 The SDDE setup
3 The level model
4 Proofs and technical results
References

Paper C Recovering the background noise of a Lévy-driven CARMA process using an SDDE approach
by Mikkel Slot Nielsen and Victor Rohde
1 Introduction
2 CARMA processes and their dynamic SDDE representation
3 Estimation of the SDDE parameters
4 A simulation study, p = 2
5 Conclusion and further research
References

Paper D Multivariate stochastic delay differential equations and CAR representations of CARMA processes
by Andreas Basse-O’Connor, Mikkel Slot Nielsen, Jan Pedersen and Victor Rohde
1 Introduction and main ideas
2 Notation
3 Stochastic delay differential equations
4 Examples and further results
5 Proofs and auxiliary results
References

Paper E Stochastic differential equations with a fractionally filtered delay: a semimartingale model for long-range dependent processes
by Richard A. Davis, Mikkel Slot Nielsen and Victor Rohde
1 Introduction
2 Preliminaries
3 The stochastic fractional delay differential equation
4 Delays of exponential type
5 Simulation from the SFDDE
6 Proofs
7 Supplement
References

Paper F Limit theorems for quadratic forms and related quantities of discretely sampled continuous-time moving averages
by Mikkel Slot Nielsen and Jan Pedersen
1 Introduction
2 Preliminaries
3 Further results and examples
4 Proofs
References

Paper G On non-stationary solutions to MSDDEs: representations and the cointegration space
by Mikkel Slot Nielsen
1 Introduction and main results
2 Preliminaries
3 General results on existence, uniqueness and representations of solutions to MSDDEs
4 Cointegrated multivariate CARMA processes
5 Proofs
References

Paper H Low frequency estimation of Lévy-driven moving averages
by Mikkel Slot Nielsen
1 Introduction
2 Estimators of interest and asymptotic results
3 Examples
References

Paper I A statistical view on a surrogate model for estimating extreme events with an application to wind turbines
by Mikkel Slot Nielsen and Victor Rohde
1 Introduction
2 The model
3 Application to extreme event estimation for wind turbines
4 Conclusion
References

Preface

This dissertation is the result of my PhD studies carried out from May 1, 2015 to

July 3, 2019 at Department of Mathematics, Aarhus University, under supervision

of Andreas Basse-O’Connor (main supervisor) and Jan Pedersen (co-supervisor). My

studies were fully funded by Andreas’ grant (DFF–4002–00003) from the Danish

Council for Independent Research.

The dissertation consists of the following nine self-contained papers:

Paper A Equivalent martingale measures for Lévy-driven moving averages and related processes. Stochastic Processes and their Applications 128(8), 2538–2556.

Paper B Stochastic delay differential equations and related autoregressive models. Stochastics (forthcoming), 24 pages.

Paper C Recovering the background noise of a Lévy-driven CARMA process using an SDDE approach. Proceedings ITISE 2017 2, 707–718.

Paper D Multivariate stochastic delay differential equations and CAR representations of CARMA processes. Stochastic Processes and their Applications (forthcoming), 25 pages.

Paper E Stochastic differential equations with a fractionally filtered delay: a semimartingale model for long-range dependent processes. Bernoulli (forthcoming), 30 pages.

Paper F Limit theorems for quadratic forms and related quantities of discretely sampled continuous-time moving averages. ESAIM: Probability and Statistics (forthcoming), 20 pages.

Paper G On non-stationary solutions to MSDDEs: representations and the cointegration space. Submitted.

Paper H Low frequency estimation of Lévy-driven moving averages. Submitted.

Paper I A statistical view on a surrogate model for estimating extreme events with an application to wind turbines. In preparation.

Up to notation and minor adjustments, Papers A–H align with their published or submitted versions. The main parts of Papers A–C were written during the first two years of my PhD studies, and thus these were also included in my progress report used for the qualifying examination, after which I obtained a master’s degree in mathematical economics. While a few of the ideas of Papers D and F were briefly discussed in the progress report as well, Papers D–I are primarily a result of the last two years of my studies. I have contributed comprehensively to both the writing and the research phases of Papers A–B and D–H. Together with Victor Rohde I have written Papers C and I, and to these we have contributed equally.

The ﬁrst chapter of the dissertation is an introduction, which motivates the use of

Lévy-driven moving averages in the modeling of continuous-time stochastic systems

and discusses the importance of obtaining knowledge of their representations, limit

theorems and certain other properties. The ﬁndings of Papers A–H deliver answers

to many of the questions raised in this discussion, and hence the main results of

these papers will also be highlighted in this chapter. Paper I, however, is an industrial

collaboration with Vestas Wind Systems A/S and concerns estimation of extreme

loads on wind turbines using covariates. Since the details are carefully explained in

the included paper and the overall aim diﬀers from that of Papers A–H, I have chosen

not to address its ﬁndings in the introductory chapter.

My four years of PhD studies have been both challenging and rewarding, and I

owe several people huge thanks for making the journey joyful. First of all, I thank my

main supervisor Andreas Basse-O’Connor for giving me the unique opportunity of

pursuing a PhD degree in a truly inspiring and intellectually stimulating research

environment and for our many fruitful discussions. His support, enthusiasm and

high ambitions have deﬁnitely pushed my limits as a researcher. A special thanks

goes to my co-supervisor Jan Pedersen, with whom I have had uncountably many

conversations spanning from technical details in proofs and general probabilistic

and statistical considerations to an analysis of the outcome of yesterday’s hockey

match. Due to his extraordinary guidance, his trust in my abilities and his positive

mindset, Jan has had a signiﬁcant impact on my development and well-being during

my studies. I feel honored that Andreas and Jan have invested this much time and

eﬀort in me—it exceeds by far what could be expected of a supervisor, and for this I

am deeply grateful.

I would also like to thank my co-author Richard A. Davis from Department of

Statistics, Columbia University, for letting me visit him and his group in New York and

for his interest in my research. Our frequent meetings and his generous hospitality

ensured that I had a constructive and pleasant stay. I thank as well my oﬃce mate

Victor Rohde for numerous fruitful collaborations, and the local LaTeX expert Lars

‘daleif’ Madsen and my oﬃce mate Mathias Ljungdahl for helping me with the

technical typesetting. Furthermore, I want to thank my colleagues at Department

of Mathematics, Aarhus University, for giving me a perfect working environment,

which I have enjoyed being a part of throughout my studies. A particular thanks goes

to Claudio Heinrich, Julie Thøgersen, Mads Stehr, Mathias Ljungdahl, Patrick Laub,

Thorbjørn Grønbæk and Victor Rohde for all the (non-)mathematical discussions and

social activities.

Finally, my family and friends deserve an abundance of gratitude for their endless

support and encouragement. I conclude with a very special thanks to my ﬁancée

Marianne, since none of this would have been possible without her.

Mikkel Slot Nielsen

Aarhus, July 2019

Summary

Similarly to the discrete-time framework, moving averages driven by white noise

processes play a crucial role in the modeling of continuous-time stochastic processes.

The main purpose of this dissertation is to address various aspects of Lévy-driven

moving averages. The existence of equivalent martingale measures, autoregressive

representations and limit theorems will be of particular interest.

Based on earlier literature on the semimartingale property for Lévy-driven moving

averages, and under rather general conditions on the Lévy process, we give necessary

and suﬃcient conditions on the driving kernel for an equivalent martingale measure

to exist. In particular, these conditions extend previous results for Gaussian moving

averages to the symmetric α-stable case with an arbitrary α ∈ (1, 2].

A signiﬁcant part of the dissertation concerns various properties of solutions to

a range of stochastic delay diﬀerential equations (SDDEs). Among other things, we

obtain suﬃcient conditions for existence and uniqueness of solutions to univariate,

multivariate, higher order and fractional SDDEs, provide moving average representations of the solutions and discuss their memory properties. A few implications of

the obtained results are that (i) invertible continuous-time ARMA processes can

be viewed as unique solutions to SDDEs, (ii) solutions can be semimartingales and

exhibit long memory at the same time, and (iii) cointegration can be embedded in

multivariate SDDEs in a straightforward manner. From the properties that we prove

for SDDEs we draw several parallels to classical results for autoregressive representa-

tions in the discrete-time literature and, hence, indicate that it may be reasonable to

think of SDDEs as the continuous-time counterpart.

We also study the limiting behavior of quadratic forms and related quantities of

discretely sampled Lévy-driven moving averages. The linear nature of Lévy-driven

moving averages and their tractable probabilistic structure allow us to obtain rather

explicit conditions on the driving kernel and the coeﬃcients of the quadratic form

ensuring asymptotic normality. The result diﬀers from those obtained in related

literature due to the quite delicate interplay between discrete-time sampling and

continuous-time convolutions. The applications of these asymptotic results are many;

in particular, we demonstrate how they can be used to obtain central limit theorems

when estimating the driving kernel parametrically using least squares.

The last part of the dissertation is related to an industrial collaboration, where

we consider prediction of extreme loads on wind turbines using only a number of

covariates and a simulation tool. In particular, we discuss how to set up a statistical

model in this situation, address some of the key assumptions and, ﬁnally, check its

performance on real-world data.


Resumé

Just as for models in discrete time, moving averages driven by white noise play a fundamental role in the modeling of stochastic processes in continuous time. The main purpose of this dissertation is to examine various aspects of Lévy-driven moving averages. We will be particularly interested in the existence of equivalent martingale measures, autoregressive representations and limit theorems.

Based on earlier literature on the semimartingale property of Lévy-driven moving averages, and under rather weak assumptions on the Lévy process, we give necessary and sufficient conditions on the driving kernel which ensure that an equivalent martingale measure exists. As a special case of this result we obtain a generalization of results for Gaussian moving averages to the symmetric α-stable case for an arbitrary α ∈ (1, 2].

A large part of the dissertation deals with various properties of solutions to a range of stochastic differential equations that involve the past of the process itself (from here on these are referred to as SDDEs for short). We give sufficient conditions ensuring existence and uniqueness of solutions to univariate and multivariate SDDEs, higher order SDDEs and fractional SDDEs. In addition, we represent the solutions as moving averages and study their dependence structure. Immediate consequences of these results are that (i) invertible ARMA processes in continuous time are unique solutions to SDDEs, (ii) solutions can be semimartingales and have long memory at the same time, and (iii) cointegration can easily be embedded in the multivariate SDDEs. From the properties proved for SDDEs we draw several parallels to classical results for autoregressive models in discrete time and in this way indicate that SDDEs can be regarded as the continuous-time counterpart.

We also study the asymptotic behavior of quadratic forms and related quantities of discrete observations from Lévy-driven moving averages. The linear structure of the moving averages and their transparent distributional properties make it possible for us to derive explicit conditions on the driving kernel and the coefficients of the quadratic form which ensure asymptotic normality. Because of the challenging interplay between discrete observations and convolutions in continuous time, the result differs from those derived in similar literature. Such asymptotic results have many applications: for example, we show how they can be used to derive central limit theorems in connection with parametric estimation of the driving kernel using the method of least squares.

The last part of the dissertation is related to an industrial collaboration, where we study estimation of extreme loads on wind turbines using a number of covariates and a simulation tool. Here we discuss how to formulate a sensible statistical model and shed light on the most important assumptions. Finally, we examine how the model performs on real-world data.

Introduction

This chapter motivates the study of Lévy-driven moving average processes, highlights

key results obtained in the included papers and addresses their relation to existing

literature. In Section 1 we discuss why Lévy-driven moving averages constitute a

convenient class for modeling a wide range of stochastic systems in time by relying

on a Wold–Karhunen type decomposition, and we review some of their properties.

This leads naturally to a discussion of the key ﬁndings of Paper A. Section 2 concerns

the speciﬁcation of the deterministic kernel driving the moving average. Speciﬁcally,

by drawing parallels to the discrete-time literature on ARMA type equations, we mo-

tivate the continuous-time ARMA processes as well as solutions to certain stochastic

delay diﬀerential equations. These are all special cases of moving averages, which

have formed the foundations of Papers B–E and G, and hence we end the section

by giving an overview of the main contributions of each of these papers. Finally, in

Section 3 we discuss the relevance of limit theorems for quadratic forms and related

quantities of Lévy-driven moving averages and relate it to Papers F and H.

1 A Wold–Karhunen type decomposition and the Lévy-driven

moving averages

There may be many reasons for modeling stochastic processes continuously in time.

To give an example, ﬁnancial data are nowadays sampled at both very high and

irregular frequencies, and the continuous-time speciﬁcation is a way to model this

type of observations in a consistent manner. Another reason is due to the remarkable

result of Delbaen and Schachermayer [18], which essentially characterizes arbitrage

opportunities in a ﬁnancial market driven by semimartingales in terms of the exis-

tence of a so-called equivalent martingale measure (cf. Paper A). For further examples

on the use of continuous-time models, see [1, 8] and [22, Section 1.2].

Suppose now that $(X_t)_{t\in\mathbb{R}}$ is a centered and weakly stationary (continuous-time) process, that is,

\[
E[X_t^2] < \infty, \quad E[X_t] = 0 \quad \text{and} \quad \bigl(h \mapsto E[X_{t+h}X_t]\bigr) = \bigl(h \mapsto E[X_hX_0]\bigr) =: \gamma_X \tag{1.1}
\]

for all $t\in\mathbb{R}$. While some phenomena may be reasonably described by such $(X_t)_{t\in\mathbb{R}}$, one may often need to transform, deseasonalize and/or detrend observations to align with such assumptions (see [11, Section 1.4] for details on this). A classical example is the evolution of a stock price $(S_t)_{t\in\mathbb{R}}$ exhibiting the random walk behavior $\lim_{t\to\infty}\operatorname{Var}(S_t)=\infty$, while its $\Delta$-period log-returns $\log S_{t+\Delta}-\log S_t$, $t\in\mathbb{R}$, might approximately meet (1.1). A related example is the (log-)prices of two stocks $(S^1_t)_{t\in\mathbb{R}}$ and $(S^2_t)_{t\in\mathbb{R}}$, which individually may wander widely, but the spread $S^1_t-S^2_t$, $t\in\mathbb{R}$,

behaves in a stationary manner. Such a situation can happen if the two stocks are

very similar by nature and one would in this case refer to them as being cointegrated

(see also Paper G). Despite the fact that the class of processes satisfying (1.1) is large and general, Theorem 1.1 shows that these conditions are not far from ensuring that, up to a term that can be perfectly predicted from the remote past (in an $L^2(\mathbb{P})$ sense), they correspond to moving averages driven by white noise processes. In the result it will be required that $(X_t)_{t\in\mathbb{R}}$ is continuous in $L^2(\mathbb{P})$ or, equivalently, $\gamma_X$ is continuous at 0. Under this assumption it follows by Bochner’s theorem that there exists a finite and symmetric Borel measure $F_X$, usually referred to as the spectral distribution of $(X_t)_{t\in\mathbb{R}}$, which has characteristic function $\gamma_X$:

\[
\gamma_X(h) = \int_{\mathbb{R}} e^{ihy}\,F_X(dy), \quad h\in\mathbb{R}. \tag{1.2}
\]

In the formulation, $f_X$ refers to the density of the absolutely continuous part of $F_X$ and $\overline{\mathrm{sp}}$ denotes the $L^2(\mathbb{P})$ closure of the linear span.

Theorem 1.1 (Karhunen [28]). Suppose that $(X_t)_{t\in\mathbb{R}}$ is a centered and weakly stationary process, which is continuous in $L^2(\mathbb{P})$. Moreover, suppose that the Paley–Wiener condition

\[
\int_{\mathbb{R}} \frac{|\log f_X(y)|}{1+y^2}\,dy < \infty \tag{1.3}
\]

is satisfied. Then there exists a unique decomposition of $(X_t)_{t\in\mathbb{R}}$ as

\[
X_t = \int_{-\infty}^{t} g(t-u)\,dZ_u + V_t, \quad t\in\mathbb{R}, \tag{1.4}
\]

where $g:\mathbb{R}\to\mathbb{R}$ belongs to $L^2$, $(Z_t)_{t\in\mathbb{R}}$ is a process with weakly stationary and orthogonal increments satisfying $E[(Z_t-Z_s)^2]=t-s$ for all $s<t$, and $(V_t)_{t\in\mathbb{R}}$ is a weakly stationary process with $V_t\in\bigcap_{s\in\mathbb{R}}\overline{\mathrm{sp}}\{X_u : u\le s\}$ for $t\in\mathbb{R}$. Moreover, if $F_X$ is absolutely continuous with a density $f_X$ satisfying (1.3) then $V_t=0$ for all $t\in\mathbb{R}$.

The stochastic integral in (1.4) is defined as an $L^2(\mathbb{P})$ limit of integrals of simple functions. While the proof of Theorem 1.1 can be found in [28, Satz 5–6], the formulation of the result is borrowed from [5, Theorem 4.1]. It is straightforward to verify that a converse of Theorem 1.1 is also true: if $g\in L^2$ and $(Z_t)_{t\in\mathbb{R}}$ is a process with weakly stationary and orthogonal increments satisfying $E[(Z_t-Z_s)^2]=t-s$ for $s<t$, then

\[
X_t = \int_{-\infty}^{t} g(t-u)\,dZ_u, \quad t\in\mathbb{R}, \tag{1.5}
\]

satisfies (1.1), and $\gamma_X$ can be represented as (1.2) with $F_X(dy)=(2\pi)^{-1}|\mathcal{F}[g](y)|^2\,dy$. Here $\mathcal{F}[g]$ denotes the Fourier transform of $g$; we define it as $\mathcal{F}[g](y)=\int_{\mathbb{R}} e^{-ity}g(t)\,dt$ for $g\in L^1$, $y\in\mathbb{R}$, and extend it to functions in $L^1\cup L^2$ by Plancherel’s theorem.
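As a worked illustration of the converse statement, consider the exponential kernel $g(t)=e^{-\lambda t}\mathbf{1}_{[0,\infty)}(t)$ with $\lambda>0$. This kernel is chosen here purely as an example and is not singled out by the text at this point; $F_X$ and $f_X$ denote the spectral distribution and its density, and $\mathcal{F}[g]$ the Fourier transform of $g$. Under these assumptions the quantities in (1.2) and (1.5) can be computed in closed form:

```latex
\begin{align*}
\mathcal{F}[g](y) &= \int_0^\infty e^{-ity}e^{-\lambda t}\,dt = \frac{1}{\lambda+iy}, \\
f_X(y) &= \frac{1}{2\pi}\,\bigl|\mathcal{F}[g](y)\bigr|^2 = \frac{1}{2\pi(\lambda^2+y^2)}, \\
\gamma_X(h) &= \int_0^\infty g(t+|h|)\,g(t)\,dt = \frac{e^{-\lambda|h|}}{2\lambda}
  = \int_{\mathbb{R}} e^{ihy} f_X(y)\,dy .
\end{align*}
```

Since $|\log f_X(y)|$ grows only like $2\log|y|$ as $|y|\to\infty$, the Paley–Wiener condition (1.3) is satisfied in this example.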

Loosely speaking, the above considerations show that weakly stationary processes correspond to causal moving averages of the form (1.5) and, thus, it would be natural to focus on modeling $g$ and $(Z_t)_{t\in\mathbb{R}}$. Note that, unless $(X_t)_{t\in\mathbb{R}}$ can be assumed to be Gaussian, in which case $(Z_t)_{t\in\mathbb{R}}$ is a standard Brownian motion, (1.5) does not reveal anything about $(X_t)_{t\in\mathbb{R}}$ beyond its second order properties. In particular, for a general noise process $(Z_t)_{t\in\mathbb{R}}$, the relation (1.5) leaves us with no insight about the path properties and the probabilistic structure of $(X_t)_{t\in\mathbb{R}}$. For instance, to assess properties of estimators based on $(X_t)_{t\in\mathbb{R}}$, it is necessary to have a better understanding of its dependence structure. This should indicate that, while the overall moving average (convolution) structure can possibly produce a wide class of interesting processes, we should require that $(Z_t)_{t\in\mathbb{R}}$

is a particularly nice process. Natural candidates are

provided by the extensively studied class of Lévy processes ([6, 9, 34]), since these will

allow us to keep track of the entire distribution of the process while maintaining the

same second order properties. Since Lévy processes have stationary and independent

increments, the use of these can be seen as the continuous-time equivalent of using

i.i.d. noise rather than just uncorrelated noise in a discrete-time setting.

Recall that a one-sided Lévy process $(L_t)_{t\ge 0}$, $L_0=0$, is a stochastic process with càdlàg sample paths having stationary and independent increments. These properties imply that $\log E[\exp\{iyL_t\}]=t\log E[\exp\{iyL_1\}]$ for $y\in\mathbb{R}$. Consequently, since

\[
\psi_L(y) := \log E\bigl[e^{iyL_1}\bigr] = iyb - \tfrac{1}{2}c^2y^2 + \int_{\mathbb{R}} \bigl(e^{iyx}-1-iyx\mathbf{1}_{\{|x|\le 1\}}\bigr)\,F(dx), \quad y\in\mathbb{R},
\]

for some $b\in\mathbb{R}$, $c^2\ge 0$ and Lévy measure $F$ by the Lévy–Khintchine formula, the distribution of $(L_t)_{t\ge 0}$ may be summarized as a triplet $(b,c^2,F)$. We extend $(L_t)_{t\ge 0}$ to a two-sided Lévy process $(L_t)_{t\in\mathbb{R}}$ by setting $L_t=-\tilde{L}_{(-t)-}$ for $t<0$, where $(\tilde{L}_t)_{t\ge 0}$ is an independent copy of $(L_t)_{t\ge 0}$. When $E[|L_1|]<\infty$, or equivalently $\int_{|x|>1}|x|\,F(dx)<\infty$, we let $\bar{L}_t=L_t-tE[L_1]$, $t\in\mathbb{R}$, denote the centered version of $(L_t)_{t\in\mathbb{R}}$.
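To make the Lévy–Khintchine formula concrete, the following minimal Python sketch evaluates the characteristic exponent for a triplet whose Lévy measure is a finite sum of point masses. The function name `char_exponent` and the representation of the Lévy measure as a list of `(x, mass)` pairs are illustrative assumptions, not notation from the dissertation.

```python
import cmath


def char_exponent(y, b, c2, atoms):
    """Characteristic exponent psi(y) = log E[exp(i*y*L_1)] given by the
    Lévy-Khintchine formula, for a Lévy measure made of point masses.

    atoms: iterable of (x, mass) pairs with x != 0 and mass >= 0
           (a hypothetical finite Lévy measure, used only for illustration).
    """
    psi = 1j * y * b - 0.5 * c2 * y ** 2
    for x, mass in atoms:
        # Jumps with |x| <= 1 are compensated by the truncation term.
        compensator = 1j * y * x if abs(x) <= 1 else 0.0
        psi += (cmath.exp(1j * y * x) - 1 - compensator) * mass
    return psi


# Brownian motion, triplet (0, 1, 0): psi(y) = -y^2 / 2.
psi_bm = char_exponent(2.0, b=0.0, c2=1.0, atoms=[])

# Poisson process with intensity 3 and jumps of size 2 (no compensation,
# since |2| > 1): psi(y) = 3 * (exp(2iy) - 1).
psi_pois = char_exponent(0.7, b=0.0, c2=0.0, atoms=[(2.0, 3.0)])
```

Multiplying the returned value by $t$ gives the exponent of $L_t$, in line with the identity $\log E[\exp\{iyL_t\}]=t\log E[\exp\{iyL_1\}]$ stated above.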

For a measurable function $g:\mathbb{R}\to\mathbb{R}$, which vanishes on $(-\infty,0)$, necessary and sufficient conditions on $(b,c^2,F,g)$ for the Lévy-driven moving average

\[
X_t = \int_{-\infty}^{t} g(t-u)\,dL_u, \quad t\in\mathbb{R}, \tag{1.6}
\]

to exist (as limits in probability of integrals of simple functions) are given in [31, Theorem 2.7]. It follows as well from [31] that the finite dimensional distributions of $(X_t)_{t\in\mathbb{R}}$ are characterized in terms of $(b,c^2,F,g)$ by the relation

\[
\log E\bigl[e^{i(y_1X_{t_1}+\cdots+y_nX_{t_n})}\bigr] = \int_{\mathbb{R}} \psi_L\bigl(y_1g(t_1+u)+\cdots+y_ng(t_n+u)\bigr)\,du,
\]

which holds for any $n\in\mathbb{N}$ and $y_1,\dots,y_n,t_1,\dots,t_n\in\mathbb{R}$. One immediate consequence of this relation is that $(X_t)_{t\in\mathbb{R}}$ is a stationary and infinitely divisible stochastic process (the finite dimensional distributions of $(X_{t+h})_{t\in\mathbb{R}}$ are infinitely divisible and do not depend on $h$). Note that, in contrast to (1.5), $(X_t)_{t\in\mathbb{R}}$ given by (1.6) need not satisfy (1.1), e.g., it may allow for a heavy-tailed marginal distribution. For instance, if $L_1$ has a symmetric $\alpha$-stable distribution for some $\alpha\in(0,2)$, then (1.6) is well-defined if and only if $g\in L^\alpha$, in which case the distribution of $X_0$ is also symmetric $\alpha$-stable ([33, Propositions 6.2.1–6.2.2]). In particular, for $p\in(0,\infty)$ it holds that $E[|X_0|^p]<\infty$ if and only if $p<\alpha$

([33, Property 1.2.16]). While the class of Lévy-driven moving

averages is rather large, it should be pointed out that more general speciﬁcations of

stationary inﬁnitely divisible processes, such as mixed moving averages (in particular,

superpositions of Ornstein–Uhlenbeck processes) and Lévy semistationary processes

have also received some attention in the literature; see [3, 4] for details.
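The moving average (1.6) can be approximated on a time grid by a Riemann sum of kernel values against Lévy increments. The sketch below, assuming NumPy is available, does this for an exponential kernel and a driving process consisting of a Brownian part plus centered compound Poisson jumps. The function names, the parameter values and the choice of kernel are hypothetical and serve only as an illustration; the history before time zero is truncated, so an initial stretch of the path should be discarded as burn-in.

```python
import numpy as np


def levy_increments(n, dt, rng, sigma=1.0, jump_rate=2.0, jump_scale=0.5):
    """Increments over a grid of mesh dt of a Lévy process with a Gaussian
    part and compound Poisson jumps of N(0, jump_scale^2) size."""
    gaussian = sigma * np.sqrt(dt) * rng.standard_normal(n)
    counts = rng.poisson(jump_rate * dt, size=n)
    # A sum of k independent N(0, s^2) jumps is N(0, k * s^2).
    jumps = jump_scale * np.sqrt(counts) * rng.standard_normal(n)
    return gaussian + jumps


def moving_average(increments, kernel, dt):
    """Riemann-sum approximation X_{t_k} ~ sum_{j <= k} g(t_k - t_j) * dL_j,
    ignoring all increments before the start of the grid."""
    n = len(increments)
    g = kernel(dt * np.arange(n))
    return np.convolve(increments, g)[:n]


rng = np.random.default_rng(0)
dt, n, lam = 0.01, 5000, 1.5
dL = levy_increments(n, dt, rng)
X = moving_average(dL, lambda t: np.exp(-lam * t), dt)
```

With these illustrative parameters the driving process is centered with $\operatorname{Var}(L_1)=\sigma^2+\text{jump\_rate}\cdot\text{jump\_scale}^2$, so after the burn-in the sample variance of `X` should be roughly $\operatorname{Var}(L_1)/(2\lambda)$, up to discretization and sampling error.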

The path properties of $(X_t)_{t\in\mathbb{R}}$ are very much related to those of $g$, and for details beyond the following discussion we refer to [32]. A fundamental question to ask is when $(X_t)_{t\ge 0}$ is a semimartingale (with respect to a suitable filtration). Indeed, Delbaen and Schachermayer [18] argue that the semimartingale property is desirable when modeling financial markets, and by the Bichteler–Dellacherie theorem it is necessary and sufficient that $(X_t)_{t\ge 0}$ is a semimartingale if it is supposed to serve as a “good” integrator (see [10, Theorem 7.6] and [19] for precise statements). Under rather mild conditions on the driving Lévy process $(L_t)_{t\in\mathbb{R}}$, [7, Corollary 4.8] provides a complete characterization of the semimartingale property within the moving average framework (1.6):

Theorem 1.2 (Basse-O’Connor and Rosiński [7]). Suppose that $(L_t)_{t\in\mathbb{R}}$ has sample paths of locally unbounded variation and that either $x\mapsto F((-x,x)^c)$ is regularly varying at $\infty$ of index $\beta\in[-2,-1)$ or $\int_{|x|>1}x^2\,F(dx)<\infty$. Then $(X_t)_{t\ge 0}$ defined as in (1.6) is a semimartingale with respect to the least filtration $(\mathcal{F}_t)_{t\ge 0}$ satisfying the usual conditions and

\[
\sigma(L_s : s\le t)\subseteq\mathcal{F}_t, \quad t\ge 0,
\]

if and only if $g$ is absolutely continuous on $[0,\infty)$ with a density $g'$ satisfying

\[
\int_0^\infty \Bigl( c^2 g'(t)^2 + \int_{\mathbb{R}} \bigl(|xg'(t)|\wedge|xg'(t)|^2\bigr)\,F(dx) \Bigr)\,dt < \infty. \tag{1.7}
\]

Furthermore, if (1.7) is satisfied, $(X_t)_{t\ge 0}$ admits the semimartingale decomposition

\[
X_t = X_0 + g(0)\bar{L}_t + \int_0^t \Bigl( \int_{-\infty}^{s} g'(s-u)\,d\bar{L}_u \Bigr)\,ds, \quad t\ge 0. \tag{1.8}
\]
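As a simple illustration of the theorem, take again the exponential kernel $g(t)=e^{-\lambda t}\mathbf{1}_{[0,\infty)}(t)$ with $\lambda>0$, and assume for this example only that $E[L_1]=0$ (so that the centered process $\bar{L}$ coincides with $L$), that $\int_{|x|>1}x^2\,F(dx)<\infty$ and that $(L_t)_{t\in\mathbb{R}}$ has paths of locally unbounded variation. Then $g(0)=1$, $g'(t)=-\lambda e^{-\lambda t}$, and both the integrability condition and the decomposition can be written out explicitly:

```latex
\begin{align*}
\int_0^\infty \Bigl( c^2 g'(t)^2 + \int_{\mathbb{R}} \bigl(|xg'(t)|\wedge|xg'(t)|^2\bigr)\,F(dx) \Bigr)\,dt
  &\le \lambda^2 \Bigl( c^2 + \int_{\mathbb{R}} x^2\,F(dx) \Bigr) \int_0^\infty e^{-2\lambda t}\,dt \\
  &= \frac{\lambda}{2} \Bigl( c^2 + \int_{\mathbb{R}} x^2\,F(dx) \Bigr) < \infty, \\
\int_{-\infty}^{s} g'(s-u)\,dL_u &= -\lambda \int_{-\infty}^{s} e^{-\lambda(s-u)}\,dL_u = -\lambda X_s, \\
X_t &= X_0 + L_t - \lambda \int_0^t X_s\,ds, \quad t\ge 0 .
\end{align*}
```

In this case the semimartingale decomposition is the integrated form of the Langevin equation $dX_t=-\lambda X_t\,dt+dL_t$, that is, $(X_t)_{t\ge 0}$ is a Lévy-driven Ornstein–Uhlenbeck process.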

If Theorem 1.2 is applicable we have that $E[|L_1|]<\infty$, and it follows that $(X_t)_{t\ge 0}$ can be decomposed into a sum of a martingale and an absolutely continuous stochastic process (in fact, this implies that $(X_t)_{t\ge 0}$ is a so-called special semimartingale as defined in [26, Definition 4.21]). Sometimes, such as when pricing derivatives or fixed income securities in a financial market driven by semimartingales, it might be important to know if the latter term can be absorbed by a suitable equivalent change of measure. To be precise, for a given $T\in(0,\infty)$ one asks if there is a probability measure $\mathbb{Q}$ on $\mathcal{F}_T$ such that:

(i) For all $A\in\mathcal{F}_T$, $\mathbb{Q}(A)>0$ if and only if $\mathbb{P}(A)>0$.

(ii) Under $\mathbb{Q}$, $(X_t)_{t\in[0,T]}$ is a local martingale with respect to $(\mathcal{F}_t)_{t\in[0,T]}$.

Such $\mathbb{Q}$ is referred to as an equivalent local martingale measure (ELMM) for $(X_t)_{t\in[0,T]}$. It should be mentioned that equivalent or, more generally, absolutely continuous change of measure for some stochastic processes (such as Markov processes and solutions to certain stochastic differential equations) is well-studied; see the introduction of Paper A for references. While it might be tempting to require that $T=\infty$ (with $\mathcal{F}_\infty=\bigvee_{t\ge 0}\mathcal{F}_t$), this is a rather serious restriction. As an example, the probability measures induced by two homogeneous Poisson processes with different intensities (on the space $D([0,\infty))$ equipped with the Skorokhod topology) are equivalent on $\mathcal{F}_T$ for any $T\in(0,\infty)$ but singular on $\mathcal{F}_\infty$, cf. [17, Remark 9.2]. The intuition is that when one has an infinite horizon, the intensity can be estimated almost surely from the Poisson process. Consequently, we will return to the question of the existence of an ELMM for $(X_t)_{t\in[0,T]}$ when fixing $T\in(0,\infty)$.
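The Poisson example can be made explicit. Suppose, as an illustration with symbols that are our own choice, that the canonical process $(N_t)_{t\ge 0}$ is a homogeneous Poisson process with intensity $\lambda>0$ under $\mathbb{P}$ and intensity $\mu>0$ under $\mathbb{Q}$, with $\lambda\ne\mu$. On a finite horizon the two measures have a strictly positive density with respect to each other, whereas on the infinite horizon the law of large numbers separates them:

```latex
\begin{align*}
\frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_T} &= e^{-(\mu-\lambda)T}\Bigl(\frac{\mu}{\lambda}\Bigr)^{N_T} > 0,
  \qquad T\in(0,\infty), \\
\mathbb{P}\Bigl(\lim_{t\to\infty}\frac{N_t}{t}=\lambda\Bigr) &= 1
  \quad\text{and}\quad
  \mathbb{Q}\Bigl(\lim_{t\to\infty}\frac{N_t}{t}=\lambda\Bigr) = 0 .
\end{align*}
```

The event $\{\lim_{t\to\infty}N_t/t=\lambda\}$ belongs to $\mathcal{F}_\infty$ but not to any $\mathcal{F}_T$ with $T<\infty$, which is why equivalence holds on every finite horizon and fails on the infinite one.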


Recall that it is a prerequisite that $(X_t)_{t\in[0,T]}$ is a semimartingale in order to admit an ELMM ([26, Theorem 3.13 (Chapter III)]). This means that if $(L_t)_{t\in\mathbb{R}}$ is a Lévy process satisfying the assumptions of Theorem 1.2, the conditions imposed on $g$ in this theorem are necessary. Except in trivial cases it must also be the case that $g(0)\ne 0$; indeed, if $g(0)=0$ and an ELMM $\mathbb{Q}$ exists, the representation (1.8) shows that

\[
X_t - X_0 = \int_0^t \Bigl( \int_{-\infty}^{s} g'(s-u)\,d\bar{L}_u \Bigr)\,ds, \quad t\in[0,T],
\]

is a local martingale under $\mathbb{Q}$, and hence it must be identically equal to zero ([20, Theorem 3.3 (Section 2)]). If the distribution of $L_1$ is not degenerate, this happens only if $g'$ is vanishing almost everywhere. On the other hand, Cheridito [14] showed that if $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion (that is, $c^2>0$ and $F\equiv 0$), the condition $g(0)\ne 0$ combined with the assumptions of Theorem 1.2 are also sufficient for the existence of an ELMM. The main purpose of Paper A has been to establish conditions ensuring that $(X_t)_{t\in[0,T]}$ admits an ELMM beyond the Gaussian setting.

1.1 Paper A

Inspired by the structure of $(X_t)_{t\in[0,T]}$ in (1.8), this paper investigates when an ELMM exists for semimartingales of the form

\[ X_t = L_t + \int_0^t Y_s\, ds, \qquad t \in [0,T], \]

under the assumption that $(Y_t)_{t\in[0,T]}$ is a predictable process such that $\int_0^T |Y_t|\, dt < \infty$ almost surely and $\mathbb{E}[|L_1|] < \infty$. In Theorem 2.1 (Paper A) we give rather explicit sufficient conditions for $(X_t)_{t\in[0,T]}$ to admit an ELMM. Specifically, each of the following two statements is sufficient:

(i) The collection $(Y_t)_{t\in[0,T]}$ is tight, each $Y_t$ is infinitely divisible and the corresponding Lévy measures $(\nu_t)_{t\in[0,T]}$ meet $\sup_{t\in[0,T]} \nu_t([-K,K]^c) = 0$ for some $K > 0$. Moreover, the Lévy measure $F$ of $(L_t)_{t\in[0,T]}$ satisfies $F((-\infty,0)), F((0,\infty)) > 0$.

(ii) The Lévy measure $F$ of $(L_t)_{t\in[0,T]}$ satisfies $F((-\infty,-K]), F([K,\infty)) > 0$ for all $K > 0$.

The somewhat canonical example of a process $(Y_t)_{t\in[0,T]}$ satisfying (i) is a stationary and infinitely divisible process where the Lévy measure of $Y_0$ is compactly supported. More concretely, it could be a moving average with a bounded kernel driven by a Lévy process with a compactly supported Lévy measure. Loosely speaking, (ii) states that no further assumptions on $(Y_t)_{t\in[0,T]}$ are needed as long as $(L_t)_{t\in[0,T]}$ can have jumps of arbitrarily large positive and negative size. As an almost immediate consequence of these findings and Theorem 1.2 above, we obtain a quite general result on the existence of an ELMM for $(X_t)_{t\in[0,T]}$ given by (1.6); see Theorem 1.2 of Paper A for details. Among other things, this result implies that if $(L_t)_{t\in\mathbb{R}}$ is a symmetric $\alpha$-stable Lévy process for some $\alpha \in (1,2]$, then there exists an ELMM for $(X_t)_{t\in[0,T]}$ if and only if $g(0) \neq 0$ and $g$ is absolutely continuous on $[0,\infty)$ with a density $g'$ which belongs to $L^\alpha$ (cf. Corollary 1.3 of Paper A). Consequently, this result provides a natural extension of the Gaussian setup studied in [14].


It should be stressed that the techniques used in [14] cannot be transferred into

the non-Gaussian setting that we consider in this paper. Speciﬁcally, his proof relies

on a localized version of the Novikov condition by showing that



\[ \mathbb{E}\Big[\exp\Big\{\tfrac{1}{2}\int_s^t Y_u^2\, du\Big\}\Big] < \infty \tag{1.9} \]

as long as $t - s \in (0,\delta)$ for a $\delta > 0$ sufficiently small. While this can be verified in a Gaussian setup, such a requirement is rarely satisfied in other situations. In fact, if $Y_u$ is infinitely divisible with a non-trivial Lévy measure, (1.9) will never be satisfied ([34, Theorem 26.1]). The conditions (i)–(ii) above are instead results of two alternative and very different techniques. Indeed, (i) makes use of a general predictable criterion of Lépingle and Mémin [29], and (ii) is obtained by carefully constructing $\mathbb{Q}$ so that it changes the distribution of the large jumps of $(L_t)_{t\in[0,T]}$ but leaves the jump intensity constant, thereby avoiding finite explosion times.

2 Dynamic models for Lévy-driven moving averages

While the Lévy-driven moving averages define a rather flexible and tractable class of stationary continuous-time processes, we are still left with the question: What are reasonable choices of the kernel $g$? It may be desirable to choose $g$ so that $(X_t)_{t\in\mathbb{R}}$ exhibits a certain autoregressive (dynamic) behavior. Since autoregressive and moving average representations have different advantages, one would often aim at getting parsimonious representations in both domains without losing too much flexibility, e.g., in terms of possible autocovariances or, equivalently, spectral distributions that can be generated by the model.

Motivation: To make the above discussion more concrete, let us take a step back and consider the discrete-time equations

\[ Y_t = \sum_{j=0}^\infty \psi_j \varepsilon_{t-j} \quad \text{and} \quad \sum_{j=0}^\infty \pi_j Y_{t-j} = \varepsilon_t, \qquad t \in \mathbb{Z}, \tag{2.1} \]

for suitable sequences of coefficients $(\psi_t)_{t\in\mathbb{N}}$ and $(\pi_t)_{t\in\mathbb{N}}$, and an i.i.d. noise $(\varepsilon_t)_{t\in\mathbb{Z}}$. Some choices of $(\psi_t)_{t\in\mathbb{N}}$ lead to a stationary moving average $(Y_t)_{t\in\mathbb{Z}}$, defined by the first equation of (2.1), which satisfies the second equation of (2.1) for a suitable choice of $(\pi_t)_{t\in\mathbb{N}}$. Conversely, for some choices of $(\pi_t)_{t\in\mathbb{N}}$ the second equation of (2.1) has a unique stationary solution given by the first equation of (2.1) with a suitably chosen sequence $(\psi_t)_{t\in\mathbb{N}}$. We will refer to the first and second equation of (2.1) as a moving average representation and an autoregressive representation of $(Y_t)_{t\in\mathbb{Z}}$, respectively. While a moving average representation is convenient for assessing several distributional properties of $(Y_t)_{t\in\mathbb{Z}}$, an autoregressive representation provides a lot of valuable insight concerning the dynamic behavior of $(Y_t)_{t\in\mathbb{Z}}$; e.g., it can be used for prediction and estimation purposes, to simulate sample paths or to filter out the noise $(\varepsilon_t)_{t\in\mathbb{Z}}$ from the observed process $(Y_t)_{t\in\mathbb{Z}}$.

There is no guarantee that a simple moving average representation leads to a particularly simple autoregressive representation and vice versa. However, an extremely popular modeling class in discrete time, which allows for rather tractable representations in both domains, consists of the causal and invertible ARMA processes. Specifically, given two real polynomials $P$ and $Q$ with no zeroes on the unit disc $\mathbb{D} := \{z \in \mathbb{C} : |z| \le 1\}$, the corresponding ARMA process $(Y_t)_{t\in\mathbb{Z}}$ is the unique stationary solution to the linear difference equation

\[ P(B)Y_t = Q(B)\varepsilon_t, \qquad t \in \mathbb{Z}. \tag{2.2} \]

Here $B$ denotes the backward shift operator. In this case, $(\psi_t)_{t\in\mathbb{N}}$ and $(\pi_t)_{t\in\mathbb{N}}$ correspond to the coefficients in the power series expansion on $\mathbb{D}$ of the rational functions $Q/P$ and $P/Q$, respectively. The difficulty of computing the coefficients depends ultimately on the denominator polynomial, and hence there is a tradeoff between the simplicity of the moving average and the autoregressive specification. One advantage of the ARMA framework, however, is that the coefficients can always be obtained by relying on simple properties of the geometric series and, possibly, a partial fraction decomposition. An easy example is the AR(1) process where $P(z) = 1 - \alpha z$ for some $\alpha \in (-1,1)$ and $Q \equiv 1$. In this case $\pi_0 = 1$, $\pi_1 = -\alpha$ and $\pi_j = 0$ for $j \ge 2$, while $\psi_j = \alpha^j$ for all $j \ge 0$. There exists a vast amount of literature related to ARMA processes and various extensions. For further details, see [11, 25].
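The power series expansions of $Q/P$ and $P/Q$ can be computed by a simple recursion. The sketch below is illustrative only: the helper name `series_quotient` and the convention of storing polynomial coefficients in increasing powers are assumptions made here, not notation from the papers.

```python
def series_quotient(num, den, n_terms):
    """First n_terms power-series coefficients of num(z)/den(z) around z = 0.

    Polynomials are coefficient lists in increasing powers; den[0] must be nonzero.
    Uses c_j = (num_j - sum_{k>=1} den_k * c_{j-k}) / den_0.
    """
    coeffs = []
    for j in range(n_terms):
        acc = num[j] if j < len(num) else 0.0
        for k in range(1, min(j, len(den) - 1) + 1):
            acc -= den[k] * coeffs[j - k]
        coeffs.append(acc / den[0])
    return coeffs


alpha = 0.5
P = [1.0, -alpha]  # P(z) = 1 - alpha z  (AR(1) example)
Q = [1.0]          # Q(z) = 1

ma_coeffs = series_quotient(Q, P, 6)  # moving average coefficients: alpha**j
ar_coeffs = series_quotient(P, Q, 6)  # autoregressive coefficients: 1, -alpha, 0, ...
print(ma_coeffs)
print(ar_coeffs)
```

The same function applies to a general causal and invertible ARMA specification, e.g. `series_quotient([1.0, 0.4], [1.0, -0.5], 10)` for an ARMA(1,1) example with hypothetical coefficients.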

Continuous-time ARMA equations: Since the coefficients in the moving average representation of the discrete-time AR(1) process take a geometric form, the continuous-time equivalent is naturally $g(t) = e^{-\lambda t}$ for $t \ge 0$ and a given $\lambda > 0$. The corresponding process $(X_t)_{t\in\mathbb{R}}$ given by (1.6), known as the Ornstein–Uhlenbeck process, is perhaps the most well-studied Lévy-driven moving average of all time, and it can be characterized as the unique stationary solution to the stochastic differential equation

\[ X_t - X_s = -\lambda \int_s^t X_u\, du + L_t - L_s, \qquad s < t. \tag{2.3} \]

Ornstein–Uhlenbeck processes enjoy many properties: they are Markovian, their possible one-dimensional marginal laws coincide with the self-decomposable distributions and a sampled Ornstein–Uhlenbeck process $(X_{t\Delta})_{t\in\mathbb{Z}}$ is an AR(1) process for any $\Delta > 0$. For details about Ornstein–Uhlenbeck processes and further references, see Section 1 of Paper B.

Defining formally the derivatives $(DX_t)_{t\in\mathbb{R}}$ and $(DL_t)_{t\in\mathbb{R}}$ of $(X_t)_{t\in\mathbb{R}}$ and $(L_t)_{t\in\mathbb{R}}$, respectively, (2.3) reads $(D + \lambda)X_t = DL_t$ for $t \in \mathbb{R}$. In light of this equation and (2.2) it makes sense to view a process $(X_t)_{t\in\mathbb{R}}$ as a continuous-time ARMA (CARMA) process if it is stationary and satisfies the formal equation

\[ P(D)X_t = Q(D)DL_t, \qquad t \in \mathbb{R}, \tag{2.4} \]

for two real polynomials $P$ and $Q$. Although the derivatives on the right-hand side will not be well-defined in the usual sense (except in trivial cases), $(X_t)_{t\in\mathbb{R}}$ is defined rigorously through its corresponding moving average representation. Specifically, by assuming that $p := \deg(P)$ and $q := \deg(Q)$ satisfy $p > q$ and that $P$ has no zeroes on $\{z \in \mathbb{C} : \operatorname{Re}(z) \ge 0\}$, there exists a function $g : \mathbb{R} \to \mathbb{R}$ which vanishes on $(-\infty,0)$ and has Fourier transform

\[ \mathcal{F}[g](y) = \frac{Q(iy)}{P(iy)}, \qquad y \in \mathbb{R}. \]


As for the ARMA processes, the rational form of the Fourier transform ensures that one can compute $g$ explicitly by relying on the fact that $t \mapsto \mathbb{1}_{[0,\infty)}(t)e^{-\lambda t}$ has Fourier transform $y \mapsto (\lambda + iy)^{-1}$ for any $\lambda > 0$. This construction ensures that $g$ is absolutely continuous on $[0,\infty)$ and decays exponentially fast at $\infty$, and hence the causal CARMA($p,q$) process with polynomials $P$ and $Q$ can be rigorously defined as the moving average (1.6) with kernel $g$ as long as $\mathbb{E}[\log^+ |L_1|] < \infty$. On a heuristic level, one can apply the Fourier transform on both sides of the equation (2.4) and rearrange terms in order to reach the conclusion that a CARMA process should have such a moving average representation. For applications and properties of the CARMA process as well as details about its definition, see Sections 1 and 4.3 of Paper D and references therein.
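When $P$ is monic with distinct zeroes $r_1, \dots, r_p$ in the open left half-plane, a partial fraction decomposition of $Q/P$ together with the exponential Fourier pair gives $g(t) = \sum_k Q(r_k)/P'(r_k)\, e^{r_k t}$ for $t \ge 0$. The following sketch evaluates this formula; the distinct-root assumption, the function names and the example polynomials are illustrative choices and not taken from the papers.

```python
import cmath


def polyval(coeffs, z):
    """Evaluate a polynomial with coefficients in increasing powers (Horner scheme)."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def carma_kernel(p_roots, q_coeffs, t):
    """g(t) = sum_k Q(r_k) / P'(r_k) * exp(r_k t) for t >= 0, and 0 for t < 0.

    Assumes P is monic with distinct roots p_roots, all with negative real part,
    and deg(Q) < deg(P).
    """
    if t < 0:
        return 0.0
    total = 0.0
    for k, r in enumerate(p_roots):
        dp = 1.0
        for j, s in enumerate(p_roots):
            if j != k:
                dp *= r - s  # P'(r_k) = prod_{j != k} (r_k - r_j) for monic P
        total += polyval(q_coeffs, r) / dp * cmath.exp(r * t)
    return total.real


# Hypothetical CARMA(2,1) example: P(z) = (z + 1)(z + 2), Q(z) = z + 3.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, carma_kernel([-1.0, -2.0], [3.0, 1.0], t))
```

For this example the formula reduces to $g(t) = 2e^{-t} - e^{-2t}$, and the Ornstein–Uhlenbeck kernel $e^{-\lambda t}$ is recovered with the single root $-\lambda$ and $Q \equiv 1$.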

Continuous-time autoregressive representations: To sum up, the continuous-time version of the moving average representation in (2.1) is the Lévy-driven moving average (1.6), and the ARMA equation (2.2) may naturally be interpreted as (2.4), which in turn leads to the CARMA processes that have a fairly tractable kernel $g$. Still, when comparing to the discrete-time setup, some questions arise immediately:

(i) What is an autoregressive representation in continuous time?

(ii) Which types of moving averages admit such a representation?

(iii) Does the CARMA process admit an autoregressive representation and is it particularly simple?

Suppose that $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] < \infty$. For a process $(X_t)_{t\in\mathbb{R}}$ with $\mathbb{E}[X_t] = 0$ and $\mathbb{E}[X_t^2] < \infty$ to admit an autoregressive representation it seems reasonable to require that

\[ \overline{\operatorname{sp}}\{X_u : u \le t\} \supseteq \overline{\operatorname{sp}}\{L_t - L_s : s \le t\}, \qquad t \in \mathbb{R}. \tag{2.5} \]

When $(X_t)_{t\in\mathbb{R}}$ is of the moving average form (1.6) for some $g \in L^2$ which is vanishing on $(-\infty,0)$, the reverse inclusion of (2.5) is always satisfied and equality holds if and only if $\mathcal{F}[g]$ is a so-called outer function ([21, pp. 94–95]). While there exist conditions ensuring that a function is outer, these are often not easy to check and, more importantly, in many situations the recipe for going from $(X_u)_{u\le t}$ to $L_t - L_s$ is not clear. Instead, we take the opposite standpoint and define a class of processes by an autoregressive type of equations, such that this transition is simple and transparent. Of course, then we need to argue that it contains a sufficiently wide class of moving averages; ideally, to align with the discrete-time representations, the invertible CARMA processes should form a particularly nice subclass. The relation between this class of autoregressions and moving averages should be somewhat as depicted in Figure 1.

Figure 1: Invertible and causal CARMA processes form a strict subset of the processes which admit both an autoregressive representation and a moving average representation. (Diagram regions: Autoregressions, Moving Averages, CARMA.)

The class of interest will be solutions to the so-called stochastic delay differential equations (SDDEs), which in the simplest case (univariate, first order and non-fractional) are of the form

\[ X_t - X_s = \int_s^t \int_{[0,\infty)} X_{u-v}\, \eta(dv)\, du + L_t - L_s, \qquad s < t. \tag{2.6} \]

Here $\eta$ is a finite signed measure and $(X_t)_{t\in\mathbb{R}}$ is a measurable process such that the integral in (2.6) is well-defined almost surely for each $s < t$. Among other things, the purpose of Papers B–E and G has been to address each of the questions (i)–(iii) in frameworks related to (2.6) and show that many properties of the solutions are akin to those of discrete-time autoregressions. Depending on the paper, different assumptions are put on $(X_t)_{t\in\mathbb{R}}$ in order to ensure that the integral in (2.6) is well-defined. For now let us just remark that each of the following three conditions is sufficient: (i) $\eta$ is compactly supported and $t \mapsto X_t$ is càdlàg, (ii) $(X_t)_{t\in\mathbb{R}}$ is stationary and $\mathbb{E}[|X_0|] < \infty$ and (iii) $(X_t)_{t\in\mathbb{R}}$ has stationary increments, $\mathbb{E}[|X_t|] < \infty$ for all $t$ and $\int_{[0,\infty)} t\, |\eta|(dt) < \infty$ (the latter condition is due to [5, Corollary A.3]). One of the simplest SDDEs is the Ornstein–Uhlenbeck equation (2.3), which corresponds to $\eta = -\lambda\delta_0$ with $\delta_0$ being the Dirac measure at 0. The literature has primarily focused on the case where $\eta$ is compactly supported (cf. [24, 30]), but as we shall see in Paper D, this restriction unfortunately rules out the possibility of representing CARMA processes with a non-trivial moving average polynomial as solutions to SDDEs. To the best of our knowledge, SDDEs have historically not been viewed as continuous-time equivalents to discrete-time autoregressive representations, and hence questions such as (i)–(iii) have not been raised.

Before jumping into technical descriptions of the attached papers on SDDEs, we will briefly comment on their scopes. Papers B and D address existence and uniqueness of stationary solutions to (2.6), also when the noise is much more general than $(L_t)_{t\in\mathbb{R}}$, and in Paper D the results are shown to hold true in a multidimensional and higher order setting as well. Moreover, Paper E defines a large class of fractional delays which all give rise to stationary solutions that are semimartingales and have hyperbolically decaying autocovariance functions. While the equations considered in this paper do indeed take the form (2.6) in special cases, the general framework is different and specifically tailored for producing long-memory processes. Finally, Paper G studies existence and uniqueness of solutions which are not necessarily stationary, but have stationary increments, in the same type of multivariate setting as in Paper D, and it characterizes the space of the corresponding cointegration vectors. In general, the papers draw clear parallels to well-known discrete-time models such as the fractionally integrated ARMA model and the cointegrated VAR model.

Besides whether we consider a univariate or multivariate version of (2.6), there is another factor discriminating the papers: to find solutions to (2.6) using Papers B–D we must have that $\eta([0,\infty)) \neq 0$, while Papers E and G sometimes apply in cases where $\eta([0,\infty)) = 0$. The condition $\eta([0,\infty)) = 0$ corresponds to the autoregressive polynomial having a zero at $z = 1$ in a discrete-time setting, and it is closely related to memory and stationarity properties of the solution. Table 1 gives an overview of the focus in each of the five papers on SDDEs.

Table 1: An overview of the five papers on SDDEs.

                    Univariate    Multivariate
η([0,∞)) ≠ 0        B, C          D
η([0,∞)) = 0        E             G

2.1 Papers B and D

Papers B and D are very much related in the sense that the latter extends the former

to a multivariate framework, and questions such as existence and uniqueness of

stationary solutions are addressed in both papers. Despite this, they still have fairly

diﬀerent aims:

(i)

Paper B also contains a study of an alternative type of autoregressive represen-

tation than the SDDE and many examples are provided.

(ii)

Paper D is generally more technical and is also concerned with representations

of solutions, prediction formulas, higher order SDDEs and their relation to

invertible CARMA processes.

Here we will brieﬂy discuss the main ﬁndings of the two papers, but only formulate

them in the univariate setting. The multivariate extension is more demanding from

a notational point of view and, thus, we refer to Paper D for further details. The

majority of the proofs in Papers B and D rely on the idea of rephrasing the problems

in the frequency domain and then exploiting key results from harmonic analysis,

such as certain Paley–Wiener theorems and characterizations of Hardy spaces, to

establish the existence of the appropriate functions.

The equation of interest is (2.6) with a more general noise and of higher order, namely

\[ X^{(m-1)}_t - X^{(m-1)}_s = \sum_{j=0}^{m-1} \int_s^t \int_{[0,\infty)} X^{(j)}_{u-v}\, \eta_j(dv)\, du + Z_t - Z_s, \qquad s < t, \tag{2.7} \]

where $(Z_t)_{t\in\mathbb{R}}$ is a measurable process with stationary increments, $Z_0 = 0$ and $\mathbb{E}[Z_t^2] < \infty$ for all $t \in \mathbb{R}$. Here $m \in \mathbb{N}$, the measures $\eta_0, \eta_1, \dots, \eta_{m-1}$ are finite and signed, and $(X^{(j)}_t)_{t\in\mathbb{R}}$ denotes the $j$th derivative of $(X_t)_{t\in\mathbb{R}}$ with respect to $t$. For convenience, we will assume that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator in the sense of Proposition 4.1 (Paper D). For now it suffices to know that a regular integrator ensures that the solutions we construct can be expressed as moving averages and that Lévy processes, fractional Lévy processes and many semimartingales with stationary increments are regular integrators. It should be stressed that existence and uniqueness of solutions to (2.7) can still be obtained when $(Z_t)_{t\in\mathbb{R}}$ is not a regular integrator; see Theorem 2.5 of Paper B and Theorem 3.1 of Paper D for the case $m = 1$.

As discussed in relation to Table 1, we need to impose conditions ensuring that $\sum_{j=0}^{m-1} \eta_j([0,\infty)) \neq 0$ in order to prove existence and uniqueness of stationary solutions to (2.7). Specifically, it is assumed that $\int_{[0,\infty)} t^2\, |\eta_j|(dt) < \infty$ for $j = 0, \dots, m-1$ and that the equation

\[ h(z) := z^m - \sum_{j=0}^{m-1} z^j \int_{[0,\infty)} e^{-zt}\, \eta_j(dt) = 0 \tag{2.8} \]

has no solutions on the imaginary axis $\{z \in \mathbb{C} : \operatorname{Re}(z) = 0\}$. Here $|\eta_j|$ denotes the variation of $\eta_j$. Theorem 4.5 (Paper D) states that, under these assumptions, the unique stationary solution to (2.7) is given by

\[ X_t = \int_{\mathbb{R}} g(t-u)\, dZ_u, \qquad t \in \mathbb{R}, \tag{2.9} \]

where $g : \mathbb{R} \to \mathbb{R}$ can be characterized through its Fourier transform as $\mathcal{F}[g](y) = h(iy)^{-1}$ for $y \in \mathbb{R}$. Note that $\mathcal{F}[g]$ is well-defined due to the imposed assumption on $h$. Here uniqueness means that for any other measurable and stationary process $(X_t)_{t\in\mathbb{R}}$ which has $\mathbb{E}[X_0^2] < \infty$ and satisfies (2.7), the equality in (2.9) holds true almost surely for each $t \in \mathbb{R}$. It follows that $(X_t)_{t\in\mathbb{R}}$ is a backward moving average of the form (1.5) if $g$ is vanishing on $(-\infty,0)$ almost everywhere, and this is the case if the equation in (2.8) has no solutions on $\{z \in \mathbb{C} : \operatorname{Re}(z) \ge 0\}$.

The last result addressed here concerns the possibility of representing CARMA processes as unique solutions to certain SDDEs. Hence, we consider any two real and monic polynomials $P$ and $Q$ with corresponding degrees $p > q$, and we assume that $P$ has no zeroes in $\{z \in \mathbb{C} : \operatorname{Re}(z) = 0\}$ and does not share any zeroes with $Q$. Moreover, we let $(X_t)_{t\in\mathbb{R}}$ be given by (2.9) with $\mathcal{F}[g](y) = Q(iy)/P(iy)$ for $y \in \mathbb{R}$. This setup covers in particular the causal Lévy-driven CARMA process introduced above when $\mathbb{E}[L_1^2] < \infty$ but also more general CARMA frameworks as discussed in Section 4.3 (Paper D). In line with discrete-time ARMA processes we need an invertibility assumption in order to obtain an autoregressive representation, and this amounts in turn to assuming that the zeroes of $Q$ do not belong to $\{z \in \mathbb{C} : \operatorname{Re}(z) \ge 0\}$. Note that this is exactly what is needed for $\mathcal{F}[g]$ to be outer (see [21, Exercise 2 (Section 2.7)]), which is necessary and sufficient for (2.5) to hold when $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] < \infty$. While the rational function $P/Q$ was the key ingredient in order to obtain an autoregressive representation of ARMA processes in a discrete-time setup, the continuous-time SDDE setup requires a decomposition of $P$. Specifically, we decompose $P$ as

\[ P = QR + S, \]

where $R$ and $S$ are polynomials such that $\deg(R) = p - q$ and $\deg(S) < q$ ($S \equiv 0$ if $q = 0$). Such a decomposition is unique and can be obtained using polynomial long division. Set $m = p - q$ and write

\[ R(z) = z^m - c_{m-1}z^{m-1} - \cdots - c_0, \qquad z \in \mathbb{C}, \]

for suitable $c_0, \dots, c_{m-1} \in \mathbb{R}$. The essence of Theorem 4.8 (Paper D) is that $(X_t)_{t\in\mathbb{R}}$ is the unique stationary solution to (2.7) when

\[ \eta_0(dt) = c_0\delta_0(dt) + f(t)\, dt \quad \text{and} \quad \eta_j = c_j\delta_0, \qquad j = 1, \dots, m-1, \tag{2.10} \]

where $f : \mathbb{R} \to \mathbb{R}$ is vanishing on $(-\infty,0)$ and characterized by $\mathcal{F}[f](y) = -S(iy)/Q(iy)$ for $y \in \mathbb{R}$. One should notice here that, similarly to computing the coefficients in the autoregressive representation of an ARMA process, writing up the SDDE associated to a particular CARMA process reduces to finding a function with a certain rational Fourier transform.
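The decomposition $P = QR + S$ can be carried out mechanically by polynomial long division. The sketch below is an illustration only: the function name `poly_divmod`, the storage of coefficients in increasing powers and the example polynomials are assumptions made here. In the example, $P(z) = z^2 + 3z + 2$ and $Q(z) = z + 0.5$ give $R(z) = z + 2.5$ and $S \equiv 0.75$, so that $c_0 = -2.5$ under the sign convention $R(z) = z - c_0$.

```python
def poly_divmod(p, q):
    """Return (r, s) with p = q * r + s and deg(s) < deg(q).

    Coefficient lists are in increasing powers; q must have a nonzero leading
    coefficient and deg(p) >= deg(q) is assumed.
    """
    rem = list(p)
    dq = len(q) - 1
    quot = [0.0] * (len(p) - dq)
    for i in range(len(p) - 1, dq - 1, -1):
        factor = rem[i] / q[-1]
        quot[i - dq] = factor
        for j in range(dq + 1):
            rem[i - dq + j] -= factor * q[j]  # subtract factor * z^(i-dq) * q(z)
    return quot, rem[:dq]


P = [2.0, 3.0, 1.0]  # P(z) = z^2 + 3z + 2 (hypothetical example)
Q = [0.5, 1.0]       # Q(z) = z + 0.5
R, S = poly_divmod(P, Q)
c = [-R[j] for j in range(len(R) - 1)]  # c_j from R(z) = z^m - c_{m-1} z^{m-1} - ... - c_0
print(R, S, c)
```

With a constant remainder $S \equiv s_0$ and $Q(z) = z + b$, a function with Fourier transform $-s_0/(iy + b)$ is $t \mapsto -s_0 e^{-bt}$ on $[0,\infty)$, by the exponential Fourier pair recalled in the CARMA discussion.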

2.2 Paper C

Inspired by the study of Brockwell et al. [12], the purpose of this paper is to carry out a simulation study, which is designed to check the possibility of using SDDEs to filter out (or recover) the noise process from an observed invertible Lévy-driven CARMA(2,1) process $(X_t)_{t\in\mathbb{R}}$. Specifically, the results of Papers B and D ensure the existence of $\alpha, \beta \in \mathbb{R}$ and $\gamma > 0$ such that

\[ dX_t = \alpha X_t\, dt + \beta \int_0^\infty e^{-\gamma u} X_{t-u}\, du\, dt + dL_t, \qquad t \in \mathbb{R}, \tag{2.11} \]

so by observing $(X_t)_{t\in\mathbb{R}}$ on a sufficiently fine grid the distribution of $L_1$ is estimated by discretizing (2.11). Before this step we estimate the vector $(\alpha, \beta, \gamma)$ of parameters by a least squares approach. We refer to Sections 3 and 4 (in particular, Figures 2 and 3) of Paper C for further details.
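To illustrate what discretizing an equation of this type can look like, the sketch below uses a plain Euler scheme on a grid of width `dt` and truncates the infinite past by assuming the process vanishes before time 0. This is not the procedure of Paper C; the scheme, the function names and the parameter values are hypothetical choices made for illustration.

```python
import math
import random


def simulate_path(alpha, beta, gamma, increments, dt):
    """Euler scheme for dX = (alpha*X + beta*M) dt + dL, where M_t approximates
    int_0^inf exp(-gamma*u) X_{t-u} du and the path is assumed to be 0 before time 0."""
    x, memory = 0.0, 0.0
    path = [x]
    for dl in increments:
        x = x + (alpha * x + beta * memory) * dt + dl
        memory = math.exp(-gamma * dt) * memory + x * dt  # recursive update of the delay term
        path.append(x)
    return path


def recover_increments(alpha, beta, gamma, path, dt):
    """Invert the same scheme: noise increment = observed increment minus drift."""
    memory, out = 0.0, []
    for k in range(len(path) - 1):
        out.append(path[k + 1] - path[k] - (alpha * path[k] + beta * memory) * dt)
        memory = math.exp(-gamma * dt) * memory + path[k + 1] * dt
    return out


rng = random.Random(1)
dt = 0.01
noise = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(1000)]  # Brownian increments as a stand-in
path = simulate_path(-2.5, -0.75, 0.5, noise, dt)
recovered = recover_increments(-2.5, -0.75, 0.5, path, dt)
print(max(abs(a - b) for a, b in zip(noise, recovered)))
```

Because the filter inverts exactly the scheme used for simulation, the recovered increments agree with the simulated ones up to rounding; with real data the agreement is only approximate and depends on the grid width and on the estimated parameters.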

2.3 Paper E

This paper is concerned with the question of incorporating long memory into the solutions of equations of a similar type as the SDDE in (2.6) when $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] = 1$. The notion of long memory refers in this context to a certain asymptotic behavior of either the autocovariance function $\gamma_X$ or, if it exists, the spectral density $f_X$ of the solution $(X_t)_{t\in\mathbb{R}}$, namely that

\[ \gamma_X(h) \sim \alpha h^{2\beta-1} \text{ as } h \to \infty \quad \text{or} \quad f_X(y) \sim \alpha |y|^{-2\beta} \text{ as } y \to 0 \tag{2.12} \]

for some $\alpha > 0$ and $\beta \in (0, 1/2)$. Here, and in what follows, we use the notation $f(t) \sim g(t)$ for two functions $f, g : \mathbb{R} \to \mathbb{C}$ to indicate that $f(t)/g(t) \to 1$ for $t$ tending to some specified limit. By a Tauberian argument, the two conditions in (2.12) are equivalent under suitable regularity conditions. Recall that, under the assumptions of Papers B and D, the unique solution to (2.6) is a moving average driven by $(L_t)_{t\in\mathbb{R}}$ with a kernel $g$ satisfying $\mathcal{F}[g](y) = (iy - \mathcal{F}[\eta](y))^{-1}$ for $y \in \mathbb{R}$. It is not too difficult to verify that $g \in L^1 \cap L^2$ (Lemma 2.2 of Paper B) and $f_X(y) = (2\pi)^{-1}|iy - \mathcal{F}[\eta](y)|^{-2}$ (Plancherel's theorem), and hence the solution does not possess any of the properties in (2.12).

The general equation considered in Paper E is

\[ X_t - X_s = \int_{-\infty}^t \big(D^\beta_- \mathbb{1}_{(s,t]}\big)(u) \int_{[0,\infty)} X_{u-v}\, \eta(dv)\, du + L_t - L_s, \qquad s < t, \tag{2.13} \]

where

\[ \big(D^\beta_- \mathbb{1}_{(s,t]}\big)(u) = \frac{1}{\Gamma(1-\beta)}\Big[(t-u)_+^{-\beta} - (s-u)_+^{-\beta}\Big], \qquad u \in \mathbb{R}, \]

is the right-sided Riemann–Liouville fractional derivative of the indicator function $\mathbb{1}_{(s,t]}$. While solutions to (2.13) may indeed be viewed as solutions to (2.6) in some cases (see Example 4.5 of Paper E), (2.13) is generally better suited for studying long-memory processes. To motivate this statement, note that while both (2.6) and (2.13) can be written as

\[ X_t - X_s = \int_0^\infty X_{t-u}\, \mu_{t-s}(du) + L_t - L_s, \qquad s < t, \tag{2.14} \]

for a suitable family of finite measures $(\mu_h)_{h>0}$, it can be checked that, as $y \to 0$ and for each $h > 0$, $\mathcal{F}[\mu_h](y) \sim h\eta([0,\infty))$ in the former case and $\mathcal{F}[\mu_h](y) \sim h\eta([0,\infty))(iy)^\beta$ in the latter case. When also keeping in mind that the autoregressive coefficients $(\pi_j)_{j\in\mathbb{N}}$ of discrete-time fractional (ARFIMA type) processes satisfy $\sum_{j=0}^\infty \pi_j e^{-ijy} \sim \alpha (iy)^\beta$ as $y \to 0$ for some $\alpha > 0$ (see, e.g., [11, Section 13.2]), this should indicate that (2.13) might be well-suited for the construction of long-memory processes.

In order to show existence and uniqueness of solutions to (2.13) it is assumed that $\int_{[0,\infty)} t\, |\eta|(dt) < \infty$ and that the equation

\[ h_{\eta,\beta}(z) := z^{1-\beta} - \int_{[0,\infty)} e^{-zt}\, \eta(dt) = 0 \]

has no solution $z \in \mathbb{C}$ with $\operatorname{Re}(z) \ge 0$. Here we define $z^\gamma = r^\gamma e^{i\gamma\theta}$, where $r > 0$ and $\theta \in (-\pi, \pi]$ correspond to the polar representation $z = re^{i\theta}$ of $z \in \mathbb{C} \setminus \{0\}$. Theorem 3.2 (Paper E) shows that these assumptions are sufficient to ensure that the unique solution to (2.13) is a backward moving average of the form (1.6) with $\mathcal{F}[g](y) = (iy)^{-\beta} h_{\eta,\beta}(iy)^{-1}$ for $y \in \mathbb{R}$. The notion of uniqueness is, however, weaker than in the non-fractional setting considered in Papers B and D; it is the only stationary process $(X_t)_{t\in\mathbb{R}}$ which satisfies (2.13), and which is purely non-deterministic in the sense that

\[ \mathbb{E}[X_0] = 0, \quad \mathbb{E}[X_0^2] < \infty \quad \text{and} \quad \bigcap_{t\in\mathbb{R}} \overline{\operatorname{sp}}\{X_s : s \le t\} = \{0\}. \]

Note that if $\mu_h((0,\infty)) = 0$ for all $h > 0$, (2.14) reveals immediately that translations of solutions remain solutions, and hence we cannot have the same strong type of uniqueness as in Papers B and D. Proposition 3.7 (Paper E) shows that the model generates exactly the type of long memory behavior that we asked for in (2.12):

\[ \gamma_X(h) \sim \frac{\Gamma(1-2\beta)}{\Gamma(\beta)\Gamma(1-\beta)\,\eta([0,\infty))^2}\, h^{2\beta-1} \quad \text{as } h \to \infty \]

\[ \text{and} \quad f_X(y) \sim \frac{1}{2\pi\,\eta([0,\infty))^2}\, |y|^{-2\beta} \quad \text{as } y \to 0. \]
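As a sanity check on how the two asymptotic constants fit together, the sketch below assumes the standard Fourier pair $\int_{\mathbb{R}} e^{ihy}|y|^{-2\beta}\, dy = 2\Gamma(1-2\beta)\sin(\pi\beta)\, h^{2\beta-1}$ for $h > 0$, takes the spectral constant to be $(2\pi\,\eta([0,\infty))^2)^{-1}$ and the autocovariance constant to be $\Gamma(1-2\beta)/(\Gamma(\beta)\Gamma(1-\beta)\,\eta([0,\infty))^2)$, and verifies numerically that they agree through the reflection formula $\Gamma(\beta)\Gamma(1-\beta) = \pi/\sin(\pi\beta)$. The function names and the value of `eta_mass` are hypothetical.

```python
import math


def spectral_constant(eta_mass):
    """Constant c in f_X(y) ~ c * |y|^(-2*beta) as y -> 0 (assumed form)."""
    return 1.0 / (2.0 * math.pi * eta_mass ** 2)


def autocov_constant(beta, eta_mass):
    """Constant in gamma_X(h) ~ const * h^(2*beta - 1) as h -> infinity (assumed form)."""
    return math.gamma(1 - 2 * beta) / (math.gamma(beta) * math.gamma(1 - beta) * eta_mass ** 2)


def autocov_constant_from_spectral(beta, eta_mass):
    """Same constant obtained by transforming c * |y|^(-2*beta) back to the time domain."""
    return 2.0 * spectral_constant(eta_mass) * math.gamma(1 - 2 * beta) * math.sin(math.pi * beta)


eta_mass = -1.5  # hypothetical value of eta([0, infinity)); only its square matters
for beta in (0.1, 0.25, 0.4):
    print(beta, autocov_constant(beta, eta_mass), autocov_constant_from_spectral(beta, eta_mass))
```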

An interesting feature of generating long memory processes in this way is that, in contrast to the long memory models in continuous time which are based on a fractional noise, the local path properties do not depend on $\beta$ and $(X_t)_{t\ge 0}$ is a semimartingale (see Remarks 3.9 and 3.10 as well as the comment in relation to Proposition 3.6 of Paper E). Based on the close relation between CARMA processes and SDDEs with a certain type of delay (cf. (2.10)), this subclass is studied in detail and related to the fractionally integrated CARMA processes introduced in [13].

While the proofs of the paper do indeed make use of some of the same type of results as in Papers B and D, theory from fractional calculus as well as spectral representations of stationary processes also play a significant role.


2.4 Paper G

In Papers B and D it was argued that, under some additional assumptions, a unique stationary solution to (2.6) exists if $\eta([0,\infty)) \neq 0$, and Example 4.5 of Paper E shows that a stationary solution can sometimes exist even when $\mathcal{F}[\eta](y) \sim \alpha (iy)^\beta$ as $y \to 0$ for some $\alpha > 0$ and $\beta \in (0, 1/2)$. But what happens if the convergence $\mathcal{F}[\eta](y) \to 0$ is fast? An extreme example is $\eta \equiv 0$, where a stationary solution to (2.6) cannot exist unless $(L_t)_{t\in\mathbb{R}}$ is identically zero. A more moderate example could be $\eta(dt) = (Df)(t)\, dt$ with $f, Df \in L^1$. To be able to find solutions in such situations it seems reasonable to allow that a solution is not stationary, but only has stationary increments. In the literature ([16]), a process with these characteristics is often referred to as being integrated (of order one).

The purpose of Paper G is to study solutions to SDDEs which are possibly integrated and, in the multivariate setting, cointegrated. Cointegration refers to the phenomenon that an $n$-dimensional process $(X_t)_{t\in\mathbb{R}}$ is integrated, but $(\beta^\top X_t)_{t\in\mathbb{R}}$ is stationary for some cointegration vector $\beta \in \mathbb{R}^n \setminus \{0\}$. In the paper we prove a Granger type representation theorem, which characterizes the class of integrated solutions to SDDEs under appropriate assumptions. This representation reveals in particular that increments of solutions are uniquely determined, but the possible translations as well as the number of linearly independent cointegration vectors are tied to the rank of $\eta([0,\infty))$. Results of this type indicate that the findings of Paper G are particularly interesting in the multidimensional setting; in fact, several parallels can be drawn to the celebrated cointegrated VAR model in this case. However, to avoid introducing too much notation and to agree with the level of detail given in the above descriptions of Papers B–E, we only formulate the results in the univariate case. The reader is encouraged to consult Paper G (in particular, its introduction) for further details.

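The interest will specifically be on stochastic processes $(X_t)_{t\in\mathbb{R}}$ with the following properties:

(i) $(X_t)_{t\in\mathbb{R}}$ is measurable and $\mathbb{E}[X_t^2] < \infty$ for all $t \in \mathbb{R}$.

(ii) $(X_t)_{t\in\mathbb{R}}$ has stationary increments.

(iii) $(X_t)_{t\in\mathbb{R}}$ satisfies the SDDE

\[ X_t - X_s = \int_s^t \int_{[0,\infty)} X_{u-v}\, \eta(dv)\, du + Z_t - Z_s, \qquad s < t. \]

Properties (i) and (iii) implicitly impose the assumption that $\mathbb{E}[Z_t^2] < \infty$ for all $t \in \mathbb{R}$. The results obtained in Paper G are based on the assumptions that $\int_{[0,\infty)} e^{\delta t}\, |\eta|(dt) < \infty$ for some $\delta > 0$ and that the equation

\[ h_\eta(z) := z - \int_{[0,\infty)} e^{-zt}\, \eta(dt) = 0 \]

has no solutions $z \in \mathbb{C} \setminus \{0\}$ with $\operatorname{Re}(z) \ge 0$. Suppose also that

\[ \eta([0,\infty)) = 0 \quad \text{and} \quad C_0^{-1} := 1 + \int_0^\infty \eta((t,\infty))\, dt \neq 0. \tag{2.15} \]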


Under these assumptions, one of the main results in the paper states that a process $(X_t)_{t\in\mathbb{R}}$ satisfies (i)–(iii) above if and only if

\[ X_t = \xi + C_0 Z_t + \int_{-\infty}^t C(t-u)\, dZ_u, \qquad t \in \mathbb{R}, \tag{2.16} \]

for some $\xi \in L^2(\mathbb{P})$ and with $C : \mathbb{R} \to \mathbb{R}$ characterized by $\mathcal{F}[C](y) = h_\eta(iy)^{-1} - C_0(iy)^{-1}$ for $y \in \mathbb{R}$ (cf. Theorem 1.2 and Corollary 3.7 of Paper G). This shows that solutions can always be decomposed into an initial value, a "random walk" and a moving average, and that the last two of them are uniquely determined by $\eta$. As in Papers B and D the result can also be formulated without assuming that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator (cf. Theorem 3.5 of Paper G). Based on this result and the relation between SDDEs and stationary invertible CARMA processes, we discuss how one can define (co)integrated CARMA processes as solutions to certain SDDEs. Although a detailed analysis of the multivariate setting will not be presented here, it should be mentioned that in this situation, the solution will still admit the representation (2.16) with $C_0$ being a deterministic $n \times n$ matrix, $C$ a deterministic function with values in the space of $n \times n$ matrices and $\xi$ a random vector which belongs to the column space of $C_0$. Consequently, if $\beta \in \mathbb{R}^n$ satisfies $\beta^\top C_0 = 0$, it follows from (2.16) that $(\beta^\top X_t)_{t\in\mathbb{R}}$ is a moving average and, thus, $\beta$ is a cointegration vector. Theorem 1.2 of Paper G reveals that the space of such vectors coincides with the row space of $\eta([0,\infty))$.

The ideas in the proofs are again based on attacking problems in the spectral domain by relying on Hardy space theory and spectral representations of stationary processes. By heuristically applying the Fourier transform to the equation (2.6), one easily arrives at the conclusion that

\[ \mathcal{F}[X](y) = h_\eta(iy)^{-1} \mathcal{F}[DZ](y), \qquad y \in \mathbb{R}. \tag{2.17} \]

If $h_\eta(iy) \neq 0$ for all $y \in \mathbb{R}$ as in Papers B and D, there exists $g \in L^2$ with $\mathcal{F}[g](y) = h_\eta(iy)^{-1}$, and (2.17) indicates that $(X_t)_{t\in\mathbb{R}}$ should take the form (2.9). Now, when $\eta([0,\infty)) = 0$ the assumption that $|\eta|$ integrates $t \mapsto e^{\delta t}$ allows us to use the machinery of complex analysis to study the pole of $1/h_\eta$ at 0. Although it could be of any order $m \in \mathbb{N}$, the second assumption of (2.15) ensures that $m = 1$ (the pole is simple). It is not too difficult to see that $C_0$ is the residue of $1/h_\eta$ at 0 and that, up to the discrepancy term $\xi$, (2.16) aligns with (2.17). Loosely speaking, $m$ determines the order of integration of the solution, and hence the two assumptions of (2.15) result in solutions which are non-stationary, but have stationary increments.
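The residue statement can be checked numerically in a toy case. For a simple pole at 0, the residue of $1/h_\eta$ equals $1/h_\eta'(0) = (1 + \int_{[0,\infty)} t\, \eta(dt))^{-1}$, and $\int_{[0,\infty)} t\, \eta(dt) = \int_0^\infty \eta((t,\infty))\, dt$ by Fubini's theorem. The sketch below uses the hypothetical discrete measure $\eta = \delta_1 - \delta_0$, for which $h_\eta(z) = z + 1 - e^{-z}$ has no zeroes with $\operatorname{Re}(z) \ge 0$ other than $z = 0$; the function names are illustrative.

```python
import cmath


def h_eta(z, atoms):
    """h_eta(z) = z - sum_k w_k * exp(-z * t_k) for eta = sum_k w_k * delta_{t_k}."""
    return z - sum(w * cmath.exp(-z * t) for t, w in atoms)


def residue_at_zero(atoms, eps=1e-6):
    """Numerical residue of 1/h_eta at 0, valid when the pole at 0 is simple."""
    return (eps / h_eta(eps, atoms)).real


atoms = [(0.0, -1.0), (1.0, 1.0)]             # eta = delta_1 - delta_0 (hypothetical example)
total_mass = sum(w for _, w in atoms)         # eta([0, infinity)), equal to 0 here
first_moment = sum(t * w for t, w in atoms)   # int t eta(dt) = int_0^inf eta((t, inf)) dt
print(total_mass, 1.0 / (1.0 + first_moment), residue_at_zero(atoms))
```

In this example both the closed-form value and the numerical approximation are close to $1/2$.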

It should be stressed that, since $(Z_t)_{t\in\mathbb{R}}$ is not necessarily a Lévy process and could in principle be stationary, $(X_t)_{t\in\mathbb{R}}$ given by (2.16) can also be stationary, and hence it might be misleading to call it "integrated". To make the definitions independent of the stationarity properties of $(Z_t)_{t\in\mathbb{R}}$, one can rely on the particular framework and define an integrated process to be a stochastic process which satisfies

\[ X_t - X_0 = \int_{\mathbb{R}} [g(t-u) - g(-u)]\, dZ_u \quad \text{and} \quad \int_{\mathbb{R}} [g(t+u) - g(u)]\, du \neq 0 \]

for all $t \neq 0$ and a suitable function $g : \mathbb{R} \to \mathbb{R}$ with $(u \mapsto g(t+u) - g(u)) \in L^1 \cap L^2$ for $t > 0$. This strategy is well-known in the discrete-time literature (cf. [27, Definition 1]).


3 Limit theorems for quadratic forms and related quantities of

Lévy-driven moving averages

Sections 1–2 introduced Lévy-driven moving averages of the form (1.6) as a convenient class to model continuous-time stationary processes and discussed important subclasses such as CARMA processes and solutions to certain SDDEs. The next task could naturally concern estimation within one of these particular subclasses. In the continuous-time framework one often distinguishes between three types of regimes, namely low, mixed and high frequency. Since Papers F and H consider only the low frequency setting, this will be our focus in the following section.

Consider a sample $Y(n) := [Y_1, \dots, Y_n]$ from a discrete-time stationary process $(Y_t)_{t\in\mathbb{N}}$, from which we want to infer a parameter $\theta_0$ belonging to some set $\Theta$. The attention will be restricted to parametric estimation, as we will assume that $\Theta$ is a compact subset of $\mathbb{R}^d$ for some $d \in \mathbb{N}$. For instance, if $(Y_t)_{t\in\mathbb{N}}$ is the ARMA process satisfying (2.2) for some pair of polynomials, one could be interested in estimating the coefficients of these polynomials as well as the variance of the innovations. It could also be the case that the observations stem from an underlying continuous-time process, e.g.,

\[ Y_t = X_{t\Delta}, \quad t \in \mathbb{N}, \tag{3.1} \]

for some $\Delta > 0$ where $(X_t)_{t\in\mathbb{R}}$ is a CARMA process, a solution to an SDDE or, more generally, a moving average. Many parametric estimators based on $Y(n)$ can be characterized as

\[ \hat{\theta}_n \in \operatorname*{argmin}_{\theta\in\Theta} \ell_n(\theta) \tag{3.2} \]

for a sufficiently regular objective function $\ell_n = \ell_n(\,\cdot\,; Y(n))$. To study second order asymptotic properties of $\hat{\theta}_n$ as $n \to \infty$ it is important to establish a limit theorem for a suitably scaled version of the first order derivative $\ell_n'(\theta_0)$ of $\ell_n$. For a wide range of popular choices of $\ell_n$, such as squared linear prediction errors, the negative Gaussian likelihood and Whittle’s approximation, $\ell_n'(\theta_0)$ is closely related to the sample autocovariances of $Y(n)$ as well as quantities of the form

\[ Q_n = \sum_{t,s=1}^{n} b(t-s)\, Y_t Y_s, \quad n \in \mathbb{N}, \tag{3.3} \]

where $b\colon \mathbb{Z} \to \mathbb{R}$ is an even function (see the introduction of Paper F for details).
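The link between the quadratic form (3.3) and the sample autocovariances can be made concrete. The following is only an illustrative sketch, not code from Paper F: the function names and the coefficient function `b_example` are hypothetical, and the mean is assumed to be known and equal to zero. For an even $b$, grouping the double sum by lag gives $Q_n = n \sum_{|h| < n} b(h)\,\hat{\gamma}_n(h)$ with $\hat{\gamma}_n(h) = n^{-1}\sum_{t=1}^{n-|h|} Y_t Y_{t+|h|}$.

```python
import numpy as np

def quadratic_form(y, b):
    """Q_n = sum_{t,s=1}^n b(t - s) * y_t * y_s, using the Toeplitz matrix (b(t - s))."""
    n = len(y)
    idx = np.arange(n)
    lags = idx[:, None] - idx[None, :]
    weights = np.vectorize(b, otypes=[float])(lags)
    return float(y @ weights @ y)

def quadratic_form_via_autocov(y, b):
    """Same quantity, written as n * sum_h b(h) * gamma_hat(h) (zero mean assumed)."""
    n = len(y)

    def gamma_hat(h):
        h = abs(h)
        return float(np.dot(y[: n - h], y[h:])) / n

    return n * sum(b(h) * gamma_hat(h) for h in range(-(n - 1), n))

def b_example(h):
    # Hypothetical even coefficient function, chosen only for illustration.
    return 1.0 / (1.0 + abs(h)) ** 2

rng = np.random.default_rng(0)
y = rng.standard_normal(50)
q_direct = quadratic_form(y, b_example)
q_autocov = quadratic_form_via_autocov(y, b_example)
```

Both functions return the same number up to rounding; the second form makes explicit why limit theorems for sample autocovariances and for $Q_n$ are closely related.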

One general way to prove limit theorems for such quantities is to impose strict assumptions on the dependence structure of $(Y_t)_{t\in\mathbb{N}}$, e.g., rapidly decaying mixing coefficients. Besides often being too restrictive, such conditions are very difficult to verify in many situations (see the discussion in [2]). Instead, by assuming a certain structure of $(Y_t)_{t\in\mathbb{N}}$, it is possible to analyze the quantities directly and prove limit theorems, even in cases where the autocovariance function is slowly decaying. For instance, if $(Y_t)_{t\in\mathbb{N}}$ is a discrete-time moving average as in the first equation of (2.1), one can give precise conditions on $b$, the moving average coefficients and the noise to ensure that the sample autocovariances and the quadratic form $Q_n$ are asymptotically Gaussian (see [11, Section 7] and [23]). The situation where $(Y_t)_{t\in\mathbb{N}}$ is given by (3.1) for some Lévy-driven moving average $(X_t)_{t\in\mathbb{R}}$ is only partly covered. Indeed, asymptotic results concerning the sample autocovariances are established ([15, 35]), but results on the asymptotic behavior of $Q_n$ have been missing.
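To fix ideas about the low frequency sampling scheme (3.1), the sketch below approximates $Y_t = \int g(t\Delta - u)\,\mathrm{d}L_u$ by a left-point Riemann sum over a fine grid. It is a minimal illustration under stated assumptions, not the simulation scheme of the papers: the driving Lévy process is taken to be a Brownian motion, the kernel is the Ornstein–Uhlenbeck kernel $g(x) = e^{-x}\mathbf{1}_{[0,\infty)}(x)$, the integral is truncated to a finite window, and all names are hypothetical.

```python
import numpy as np

def sample_moving_average(g, increments, grid, delta, n):
    """Approximate Y_t = int g(t*delta - u) dL_u for t = 1, ..., n by
    sum_j g(t*delta - u_j) * (L_{u_j + step} - L_{u_j})."""
    t = delta * np.arange(1, n + 1)
    kernel = g(t[:, None] - grid[None, :])   # shape (n, len(grid))
    return kernel @ increments

def ou_kernel(x):
    # g(x) = exp(-x) for x >= 0 and 0 otherwise (assumed example kernel).
    return np.where(x >= 0, np.exp(-np.clip(x, 0, None)), 0.0)

rng = np.random.default_rng(1)
step = 0.01
grid = np.arange(-20.0, 50.0, step)                 # truncated integration window
increments = rng.standard_normal(grid.size) * np.sqrt(step)  # Brownian increments
y = sample_moving_average(ou_kernel, increments, grid, delta=1.0, n=50)
```

Replacing `increments` by the increments of another Lévy process (for instance a compound Poisson process) changes the driving noise without touching the rest of the sketch.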

3.1 Paper F

The main purpose of this paper has been to give general sufficient conditions on $g$, $b$ and $(L_t)_{t\in\mathbb{R}}$ to ensure that

\[ \frac{Q_n - \mathbb{E}[Q_n]}{\sqrt{n}} \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, \eta^2), \quad n \to \infty, \tag{3.4} \]

when $(Q_n)_{n\in\mathbb{N}}$ is given by (3.3) and $Y_t = \int_{\mathbb{R}} g(t\Delta - u)\,\mathrm{d}L_u$ for $t \in \mathbb{N}$. Here $\mathcal{N}(\xi, \eta^2)$ is the Gaussian distribution with mean $\xi \in \mathbb{R}$ and variance $\eta^2 \geq 0$, and $\xrightarrow{\ \mathcal{D}\ }$ denotes convergence in law. The tricky part of studying the limiting behavior of $Q_n$ is that it involves a double sum. To succeed we followed a strategy similar to that of [23], which goes by first proving a result of the type (3.4) with $Q_n$ replaced by

\[ S_n = \sum_{t=1}^{n} \int_{\mathbb{R}} g_1(t\Delta - u)\,\mathrm{d}L_u \int_{\mathbb{R}} g_2(t\Delta - u)\,\mathrm{d}L_u, \quad n \in \mathbb{N}, \tag{3.5} \]

and next approximating $Q_n$ by $S_n$ with a clever choice of $g_1$ and $g_2$. Note that a special case of (3.5) is the sample autocovariance of moving averages (assuming the mean is known to be zero), and therefore the limiting behavior of $\ell_n'(\theta_0)$ as $n \to \infty$ can sometimes also be determined by relying on results for quantities of the same form as $S_n$ (see Examples 3.3 and 3.4 of Paper F for details). This means that results concerning $(S_n)_{n\in\mathbb{N}}$ may be of independent interest, and hence we will discuss central limit theorems for both $(S_n)_{n\in\mathbb{N}}$ and $(Q_n)_{n\in\mathbb{N}}$ here.

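Throughout the paper it is assumed that $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^4] < \infty$. The most general result (Theorem 3.1 of Paper F) obtained for $(S_n)_{n\in\mathbb{N}}$ is that

\[ \frac{S_n - \mathbb{E}[S_n]}{\sqrt{n}} \xrightarrow{\ \mathcal{D}\ } \mathcal{N}(0, \eta^2), \quad n \to \infty, \tag{3.6} \]

for some $\eta > 0$, if $g_1$ and $g_2$ satisfy the following conditions: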

(S1)

(t)g

(t + · ∆)|dt ∈ `

for i = 1,2 and α

,α

∈ [1,∞] with 1/α

+ 1/α

= 1.

(S2)

(t)g

(t + · ∆)|dt ∈ `

(S3)



t 7−→ kg

(t + · ∆)g

(t + · ∆)k



∈ L

([0,∆]).

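As usual, $\ell^p$ denotes the space of sequences $(a_t)_{t\in\mathbb{Z}}$ satisfying $\sum_{t\in\mathbb{Z}} |a_t|^p < \infty$ when $p \in [1,\infty)$ and $\sup_{t\in\mathbb{Z}} |a_t| < \infty$ when $p = \infty$, and $\|\cdot\|_{\ell^p}$ is the corresponding norm. Condition (S3) is not needed if the fourth cumulant $\mathbb{E}[L_1^4] - 3\,\mathbb{E}[L_1^2]^2$ is zero or, equivalently, if $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion. A simple sufficient condition for (S1)–(S3) to hold is that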

(S*) g

∈ L

and

sup

t∈R

|t|

(

)

| < ∞

for

= 1

2 and some

,α

∈

1) with

+ α

> 3/2.

The general result for $(Q_n)_{n\in\mathbb{N}}$ (Theorem 3.5 of Paper F) establishes that (3.4) holds true under the conditions below:

(Q1)

There exist

α,β ∈

,∞

] with 1

/α

+ 1

/β

= 1, such that

(

)

(

· ∆

)

| dt ∈ `

and

(|b|? |g|)(t)(|b|? |g|)(t + · ∆) dt ∈ `

(Q2)

|g(t)|(|b|? |g|)(t + · ∆) dt ∈ `

(Q3)



t 7−→ kg(t + · ∆)(|b|? |g|)(t + · ∆)k



∈ L

([0,∆]).

In the statements above we have used the notation $(a \star f)(t) = \sum_{s\in\mathbb{Z}} a(s) f(t - s\Delta)$ for functions $a\colon \mathbb{Z} \to \mathbb{R}$ and $f\colon \mathbb{R} \to \mathbb{R}$ (and $t \in \mathbb{R}$ such that the sum is meaningful). Again, condition (Q3) can be discarded if the fourth cumulant of $L_1$ is zero. In this setup an easy-to-check condition, which implies (Q1)–(Q3), is

(Q*) g ∈L

and, for some α,β > 0 with α + β < 1/2,

sup

t∈R

|t|

1−α/2

|g(t)| < ∞ and sup

t∈Z

|t|

1−β

|b(t)|< ∞.

In Theorems 1.1 and 1.2 (Paper F), (Q*) and (S*) can be found together with other sufficient conditions.

It is not too difficult to see that conditions (Q1)–(Q3) are slightly stronger than (S1)–(S3) with $g_1 = g$ and $g_2 = b \star g$. In fact, the key step in showing (3.4) under assumptions (Q1)–(Q3) is to use (3.6) and argue that $\operatorname{Var}(Q_n - S_n)/n \to 0$ as $n \to \infty$ with this particular choice of $g_1$ and $g_2$. The proof concerning $(S_n)_{n\in\mathbb{N}}$ involves two steps, namely to show that (i) the result is true when adequately truncating $g_1$ and $g_2$ by using a central limit theorem for $m$-dependent sequences, and (ii) it remains true when passing to the limit. The conditions (S1)–(S3) and (Q1)–(Q3) should indicate the rather delicate interplay between the discrete-time sampling scheme and the continuous-time convolution structure of the moving average. Specifically, the assumptions concern either the integrability of certain sums or summability of convolutions. To obtain easy-to-check conditions as given in Theorems 1.1 and 1.2 of Paper F (in particular, (S*) and (Q*)) it was necessary to prove a suitable Young type inequality in this mixed framework (Lemma 4.3 of Paper F). Among other things, this inequality was used to prove that (S*) implies (S1)–(S3) and that (Q*) implies (Q1)–(Q3).
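The reason for pairing the kernel $g$ with the convolution of $b$ and $g$ can be seen from a short formal computation. It is a sketch only: it ignores convergence questions, assumes that the sum and the stochastic integral may be interchanged, and writes $(b \star g)(x) = \sum_{s\in\mathbb{Z}} b(s) g(x - s\Delta)$ and $Y_t = \int_{\mathbb{R}} g(t\Delta - u)\,\mathrm{d}L_u$.

```latex
\begin{align*}
\sum_{s\in\mathbb{Z}} b(t-s)\, Y_s
  &= \int_{\mathbb{R}} \sum_{s\in\mathbb{Z}} b(t-s)\, g(s\Delta - u)\,\mathrm{d}L_u
   = \int_{\mathbb{R}} \sum_{r\in\mathbb{Z}} b(r)\, g\bigl((t\Delta - u) - r\Delta\bigr)\,\mathrm{d}L_u \\
  &= \int_{\mathbb{R}} (b \star g)(t\Delta - u)\,\mathrm{d}L_u .
\end{align*}
% Hence, with g_1 = g and g_2 = b \star g in (3.5),
\begin{align*}
S_n = \sum_{t=1}^{n} Y_t \sum_{s\in\mathbb{Z}} b(t-s)\, Y_s ,
\qquad
Q_n = \sum_{t=1}^{n} Y_t \sum_{s=1}^{n} b(t-s)\, Y_s ,
\qquad
S_n - Q_n = \sum_{t=1}^{n} Y_t \sum_{s \notin \{1,\dots,n\}} b(t-s)\, Y_s .
\end{align*}
```

In this reading, $S_n$ and $Q_n$ differ only through the terms of the inner sum that fall outside the sample, which is what the variance bound on $Q_n - S_n$ has to control.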

3.2 Paper H

This paper demonstrates how to use the results of Paper F to obtain asymptotic normality of a certain type of least squares estimator. Specifically, it is assumed that $(Y_t)_{t\in\mathbb{N}}$ is of the form

\[ Y_t = \int_{\mathbb{R}} g(t\Delta - u)\,\mathrm{d}L_u, \quad t \in \mathbb{N}, \tag{3.7} \]

for some Lévy process $(L_t)_{t\in\mathbb{R}}$ satisfying $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^4] < \infty$ and kernel $g$ belonging to a suitable parametrized class of functions $\{g_\theta : \theta \in \Theta\} \subseteq L^2$. For simplicity, it is also assumed that $\mathbb{E}[L_1^2] = 1$ and, to avoid trivial cases, that the set of $t$ such that $g(t) \neq 0$ is not a Lebesgue null set. The aim is to estimate the vector $\theta_0 \in \Theta$ with the property $g = g_{\theta_0}$ from the sample $Y(n)$ using the estimator $\hat{\theta}_n$ in (3.2) with

\[ \ell_n(\theta) = \sum_{t=k+1}^{n} \bigl(Y_t - \pi(Y_t; \theta)\bigr)^2, \quad \theta \in \Theta, \]

for some $k \in \mathbb{N}$ with $k < n$. Here $\pi(Y_t; \theta)$ is the $L^2(\mathbb{P})$ projection of $Y_t$ onto the linear span of $Y_{t-1}, \dots, Y_{t-k}$ computed under the model (3.7) with $g = g_\theta$. It is assumed that the functions

\[ \theta \longmapsto \int_{\mathbb{R}} g_\theta(j\Delta + t)\, g_\theta(t)\,\mathrm{d}t, \quad j = 0, 1, \dots, k, \]

are twice continuously differentiable on the interior of $\Theta$, and an identifiability condition as well as a full rank condition are also imposed (see Condition 2.1(a)–(b) of Paper H). As already mentioned, the typical condition to impose in addition to those mentioned above is that the sequence $(Y_t)_{t\in\mathbb{N}}$ exhibits a particular mixing behavior. Due to the form of $\ell_n$ and the moving average structure of $(Y_t)_{t\in\mathbb{N}}$ we rely instead on the findings of Paper F and impose a condition directly on the driving kernel $g$:

(∗) The function t 7−→

s∈Z

(|g(t + s∆)|

4/3

+ |g(t + s∆)|

) belongs to L

([0,∆]).

A suﬃcient condition for (

∗

) to be satisﬁed is that

g ∈L

and

sup

t∈R

|t|

(

)

| < ∞

for

some

β ∈

1). Under these conditions, and provided that

belongs to the interior

, strong consistency and asymptotic normality of

is established (Theorem 2.4

of Paper H) using standard arguments. The quality of the estimator is assessed

through a simulation study in two concrete cases: (i)

belongs to the class of gamma

kernels, and (ii)

t∆

where (

)

t∈R

is the stationary solution to an SDDE with

delay measure

αδ

βδ

. See Examples 3.1 and 3.2 of Paper H for further details.
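The least squares objective described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation used in Paper H: the projection is computed from the model autocovariances $\gamma_\theta(j) = \int g_\theta(j\Delta + t) g_\theta(t)\,\mathrm{d}t$ by solving the usual Toeplitz (Yule–Walker type) system, and the example kernel family $g_\theta(t) = e^{-\theta t}\mathbf{1}_{[0,\infty)}(t)$, for which $\gamma_\theta(j) = e^{-\theta j\Delta}/(2\theta)$, is a hypothetical choice. All function names are illustrative.

```python
import numpy as np

def projection_coefficients(gamma):
    """Coefficients of the best linear predictor of Y_t given (Y_{t-1}, ..., Y_{t-k}),
    computed from the autocovariances gamma = (gamma(0), ..., gamma(k))."""
    k = len(gamma) - 1
    idx = np.arange(k)
    toeplitz = gamma[np.abs(idx[:, None] - idx[None, :])]
    return np.linalg.solve(toeplitz, gamma[1:])

def least_squares_objective(theta, y, k, autocov):
    """Sum over t = k+1, ..., n of the squared one-step linear prediction errors."""
    gamma = np.array([autocov(theta, j) for j in range(k + 1)])
    coef = projection_coefficients(gamma)
    n = len(y)
    # Column i holds Y_{t-i} for t = k+1, ..., n (y is stored 0-based).
    lagged = np.column_stack([y[k - i : n - i] for i in range(1, k + 1)])
    residuals = y[k:] - lagged @ coef
    return float(np.sum(residuals ** 2))

def ou_autocov(theta, j, delta=1.0):
    # Autocovariance at lag j for the assumed kernel exp(-theta * t) on [0, inf).
    return np.exp(-theta * j * delta) / (2.0 * theta)
```

An estimate in the spirit of (3.2) is then obtained by minimizing `least_squares_objective` over a compact parameter set, for example on a grid or with a numerical optimizer.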

References

[1] Andersen, T.G. (2000). Some reflections on analysis of high-frequency data. J. Bus. Econom. Stat. 18(2), 146–153.

[2] Ango Nze, P., P. Bühlmann and P. Doukhan (2002). Weak dependence beyond mixing and asymptotics for nonparametric regression. Ann. Statist. 30(2), 397–430. doi: 10.1214/aos/1021379859.

[3] Barndorff-Nielsen, O.E. (2000). Superposition of Ornstein–Uhlenbeck type processes. Teor. Veroyatnost. i Primenen. 45(2), 289–311. doi: 10.1137/S0040585X97978166.

[4] Barndorff-Nielsen, O.E. (2011). Stationary infinitely divisible processes. Braz. J. Probab. Stat. 25(3), 294–322. doi: 10.1214/11-BJPS140.

[5] Barndorff-Nielsen, O.E. and A. Basse-O’Connor (2011). Quasi Ornstein–Uhlenbeck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[6] Barndorff-Nielsen, O.E., T. Mikosch and S.I. Resnick (2012). Lévy processes: theory and applications. Springer Science & Business Media.

[7] Basse-O’Connor, A. and J. Rosiński (2016). On infinitely divisible semimartingales. Probab. Theory Related Fields 164(1–2), 133–163. doi: 10.1007/s00440-014-0609-1.

[8] Bergstrom, A.R. (1990). Continuous time econometric modelling. Oxford University Press.

[9] Bertoin, J. (1996). Lévy processes. Vol. 121. Cambridge Tracts in Mathematics. Cambridge University Press.

[10] Bichteler, K. (1981). Stochastic integration and $L^p$-theory of semimartingales. Ann. Probab. 9(1), 49–89.

[11] Brockwell, P.J. and R.A. Davis (2006). Time series: theory and methods. Springer Series in Statistics. Reprint of the second (1991) edition. Springer, New York.

[12] Brockwell, P.J., R.A. Davis and Y. Yang (2011). Estimation for non-negative Lévy-driven CARMA processes. J. Bus. Econom. Statist. 29(2), 250–259. doi: 10.1198/jbes.2010.08165.

[13] Brockwell, P.J. and T. Marquardt (2005). Lévy-driven and fractionally integrated ARMA processes with continuous time parameter. Statist. Sinica 15(2), 477–494.

[14] Cheridito, P. (2004). Gaussian moving averages, semimartingales and option pricing. Stochastic Process. Appl. 109(1), 47–68.

[15] Cohen, S. and A. Lindner (2013). A central limit theorem for the sample autocorrelations of a Lévy driven continuous time moving average process. J. Statist. Plann. Inference 143(8), 1295–1306. doi: 10.1016/j.jspi.2013.03.022.

[16] Comte, F. (1999). Discrete and continuous time cointegration. J. Econometrics 88(2), 207–226. doi: 10.1016/S0304-4076(98)00025-6.

[17] Cont, R. and P. Tankov (2004). Financial modelling with jump processes. Chapman & Hall/CRC Financial Mathematics Series. Chapman & Hall/CRC, Boca Raton, FL.

[18] Delbaen, F. and W. Schachermayer (1994). A general version of the fundamental theorem of asset pricing. Math. Ann. 300(3), 463–520.

[19] Dellacherie, C. (1980). “Un survol de la théorie de l’intégrale stochastique”. Measure theory, Oberwolfach 1979 (Proc. Conf., Oberwolfach, 1979). Vol. 794. Lecture Notes in Math. Springer, Berlin, 365–395.

[20] Durrett, R. (1996). Stochastic calculus. Probability and Stochastics Series. A practical introduction. CRC Press, Boca Raton, FL.

[21] Dym, H. and H.P. McKean (1976). Gaussian processes, function theory, and the inverse spectral problem. Probability and Mathematical Statistics, Vol. 31. New York: Academic Press [Harcourt Brace Jovanovich Publishers].

[22] Gandolfo, G. (2012). Continuous-time econometrics: theory and applications. Springer Science & Business Media.

[23] Giraitis, L. and D. Surgailis (1990). A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotical normality of Whittle’s estimate. Probab. Theory Related Fields 86(1), 87–104. doi: 10.1007/BF01207515.

[24] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[25] Hamilton, J.D. (1994). Time series analysis. Princeton University Press, Princeton, NJ, xvi+799.

[26] Jacod, J. and A.N. Shiryaev (2003). Limit Theorems for Stochastic Processes. Second. Vol. 288. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin. doi: 10.1007/978-3-662-05265-5.

[27] Johansen, S. (2009). “Cointegration: Overview and development”. Handbook of financial time series. Springer, 671–693.

[28] Karhunen, K. (1950). Über die Struktur stationärer zufälliger Funktionen. Ark. Mat. 1, 141–160. doi: 10.1007/BF02590624.

[29] Lépingle, D. and J. Mémin (1978). Sur l’intégrabilité uniforme des martingales exponentielles. Z. Wahrsch. Verw. Gebiete 42(3), 175–203. doi: 10.1007/BF00641409.

[30] Mohammed, S.E.A. and M.K.R. Scheutzow (1990). Lyapunov exponents and stationary solutions for affine stochastic delay equations. Stochastics Stochastics Rep. 29(2), 259–283.

[31] Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[32] Rosiński, J. (1989). On path properties of certain infinitely divisible processes. Stochastic Process. Appl. 33(1), 73–87. doi: 10.1016/0304-4149(89)90067-7.

[33] Samorodnitsky, G. and M.S. Taqqu (1994). Stable Non-Gaussian Random Processes. Stochastic Modeling. Stochastic models with infinite variance. New York: Chapman & Hall.

[34] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Vol. 68. Cambridge Studies in Advanced Mathematics. Translated from the 1990 Japanese original, Revised by the author. Cambridge University Press.

[35] Spangenberg, F. (2015). Limit theorems for the sample autocovariance of a continuous-time moving average process with long memory. arXiv: 1502.0485

Paper A

Equivalent Martingale Measures for Lévy-Driven Moving Averages and Related Processes

Andreas Basse-O’Connor, Mikkel Slot Nielsen and Jan Pedersen

Abstract

In the present paper we obtain sufficient conditions for the existence of equivalent local martingale measures for Lévy-driven moving averages and other non-Markovian jump processes. The conditions that we obtain are, under mild assumptions, also necessary. For instance, this is the case for moving averages driven by an α-stable Lévy process with α ∈ (1,2].

Our proofs rely on various techniques for showing the martingale property of stochastic exponentials.

MSC: 60E07; 60G10; 60G51; 60G57; 60H05

Keywords: Equivalent local martingale measures; Infinite divisibility; Lévy processes; Moving averages; Stochastic exponentials

1 Introduction and a main result

Absolutely continuous change of measure for stochastic processes is a classical problem in probability theory and there is a vast literature devoted to it. One motivation is the fundamental theorem of asset pricing, see Delbaen and Schachermayer [12], which relates existence of an equivalent local martingale measure to absence of arbitrage (or, more precisely, to the concept of no free lunch with vanishing risk) of a financial market. Several sharp and general conditions for absolutely continuous change of measure are given in [10, 19, 21, 24], and in case of Markov processes and solutions to stochastic differential equations, strong and explicit conditions are available, see e.g. [8, 11, 13, 17, 22] and references therein.

Paper A · Equivalent martingale measures for Lévy-driven moving averages and related

processes

The main aim of the present paper is to obtain explicit conditions for the existence of an equivalent local martingale measure (ELMM) for Lévy-driven moving averages, and these are only Markovian in very special cases. Moving averages are important in various fields, e.g. because they are natural to use when modelling long-range dependence (for other applications, see [23]). Recalling that Hitsuda’s representation theorem characterizes when a Gaussian process admits an ELMM, see [15, Theorem 6.3’], and Lévy-driven moving averages are infinitely divisible processes, our study can also be seen as a contribution to a similar representation theorem for this class.

We will now introduce our framework. Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which a two-sided Lévy process $L = (L_t)_{t\in\mathbb{R}}$, $L_0 = 0$, is defined. Fix a time horizon $T > 0$ and let

\[ X_t = \int_{-\infty}^{t} \varphi(t-s)\,\mathrm{d}L_s, \quad t \in [0,T], \tag{1.1} \]

for a given function $\varphi\colon \mathbb{R}_+ \to \mathbb{R}$ such that the integral in (1.1) is well-defined. We will refer to $(X_t)_{t\in[0,T]}$ as a (stationary) $L$-driven moving average. To avoid trivial cases, we assume that the set of $t \geq 0$ with $\varphi(t) \neq 0$ is not a Lebesgue null set. We will fix a filtration $(\mathcal{F}_t)_{t\in[0,T]}$ with the property that

\[ \sigma(L_s : -\infty < s \leq t) \subseteq \mathcal{F}_t, \quad t \in [0,T], \tag{1.2} \]

and which satisfies the usual conditions (see [16, Definition 1.3 (Ch. I)]). Furthermore, it will be assumed that $(L_t)_{t\in[0,T]}$ is an $(\mathcal{F}_t)_{t\in[0,T]}$-Lévy process in the sense that $L_t - L_s$ is independent of $\mathcal{F}_s$ for all $0 \leq s < t \leq T$. Our aim is to find explicit conditions that ensure the existence of a probability measure $\mathbb{Q}$ on $(\Omega, \mathcal{F})$, equivalent to $\mathbb{P}$, under which $(X_t)_{t\in[0,T]}$ is a local martingale with respect to $(\mathcal{F}_t)_{t\in[0,T]}$. Furthermore, we are interested in the structure of $(X_t)_{t\in[0,T]}$ under $\mathbb{Q}$.

A necessary condition for a process to admit an ELMM is that it is a semimartingale, and this property is (under mild assumptions on the Lévy measure) characterized for $L$-driven moving averages in Basse-O’Connor and Rosiński [4] and Knight [20]. Other relevant references in this direction include [3, 7]. In the case where $L$ is Gaussian, and relying on Knight [20, Theorem 6.5], Cheridito [7, Theorem 4.5] gives a complete characterization of the $L$-driven moving averages that admit an ELMM:

Theorem 1.1 (Cheridito [7]). Suppose that $L$ is a Brownian motion. Then the moving average $(X_t)_{t\in[0,T]}$ defined in (1.1) admits an ELMM if and only if $\varphi(0) \neq 0$ and $\varphi$ is absolutely continuous with a density $\varphi'$ satisfying $\varphi' \in L^2(\mathbb{R}_+)$.

Although, in general, the existence of an ELMM is a stronger condition than being a semimartingale, Theorem 1.1 shows (together with [20, Theorem 6.5]) that for Gaussian moving averages of the form (1.1), the two concepts are equivalent when $\varphi(0) \neq 0$. If $L$ has a non-trivial Lévy measure, explicit conditions for the existence of an ELMM have, to the best of our knowledge, not been provided. It would be natural to try to obtain such conditions using the same techniques as in Theorem 1.1. However, these techniques are based on a local version of the Novikov condition, which will not be fulfilled as soon as the driving Lévy process is non-Gaussian. This is an implication of the fact that $\int_{\mathbb{R}} e^{\varepsilon x^2}\,\mu(\mathrm{d}x) = \infty$ for any $\varepsilon > 0$ and any non-Gaussian infinitely divisible distribution $\mu$, see [26, Theorem 26.1]. Consequently, to prove the existence of an ELMM in a non-Gaussian setting, a completely different approach has to be used. An implication of our results is the non-Gaussian counterpart of Theorem 1.1 which is formulated in Theorem 1.2 below. In this formulation, $c \geq 0$ denotes the Gaussian component of $L$ and $F$ is its Lévy measure. Moreover, we will write $A^c = \mathbb{R} \setminus A$ for the complement of a given set $A \subseteq \mathbb{R}$.

Theorem 1.2. Suppose that $L$ has sample paths of locally unbounded variation and let $(X_t)_{t\in[0,T]}$ be a Lévy-driven moving average given by (1.1).

(1) Assume that either $x \mapsto F((-x,x)^c)$ is regularly varying at $\infty$ of index $\beta \in [-2,-1)$ or $\int_{|x|>1} x^2\,F(\mathrm{d}x) < \infty$ and that the support of $F$ is unbounded on both $(-\infty,0]$ and $[0,\infty)$. Then $(X_t)_{t\in[0,T]}$ admits an ELMM if and only if $\varphi(0) \neq 0$ and $\varphi$ is absolutely continuous with a density $\varphi'$ satisfying

\[ \int_0^\infty \Bigl( c\,\varphi'(t)^2 + \int_{\mathbb{R}} \bigl( |x\varphi'(t)|^2 \wedge |x\varphi'(t)| \bigr)\,F(\mathrm{d}x) \Bigr)\,\mathrm{d}t < \infty. \tag{1.3} \]

(2) Assume that the support of $F$ is contained in a compact set and

\[ F((-\infty,0)),\ F((0,\infty)) > 0. \]

Then $(X_t)_{t\in[0,T]}$ admits an ELMM if $\varphi(0) \neq 0$ and $\varphi$ is absolutely continuous with a density $\varphi'$, which is bounded and satisfies (1.3).

If $L$ is a symmetric $\alpha$-stable Lévy process with $\alpha \in (1,2)$, then $x \mapsto F((-x,x)^c)$ is regularly varying of index $-\alpha$ and condition (1.3) is equivalent to $\varphi' \in L^\alpha(\mathbb{R}_+)$ (see [4, Example 4.9]). Thus, since we clearly have that the support of $F$ is unbounded on both $(-\infty,0]$ and $[0,\infty)$, we can apply Theorem 1.2(1) to obtain the following natural extension of Theorem 1.1:

Corollary 1.3. Suppose that $L$ is a symmetric $\alpha$-stable Lévy process with index $\alpha \in (1,2]$. Then the moving average $(X_t)_{t\in[0,T]}$ defined in (1.1) admits an ELMM if and only if $\varphi(0) \neq 0$ and $\varphi$ is absolutely continuous with a density $\varphi'$ satisfying $\varphi' \in L^\alpha(\mathbb{R}_+)$.
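As a quick illustration, not taken from the paper, consider the Ornstein–Uhlenbeck kernel $\varphi(t) = e^{-\lambda t}$, $t \geq 0$, with a hypothetical rate $\lambda > 0$, and read the condition of Corollary 1.3 as $\varphi(0) \neq 0$ together with $\varphi' \in L^\alpha(\mathbb{R}_+)$:

```latex
\[
\varphi(0) = 1 \neq 0, \qquad
\varphi'(t) = -\lambda e^{-\lambda t}, \qquad
\int_0^\infty |\varphi'(t)|^\alpha\,\mathrm{d}t
  = \lambda^\alpha \int_0^\infty e^{-\alpha\lambda t}\,\mathrm{d}t
  = \frac{\lambda^{\alpha-1}}{\alpha} < \infty .
\]
```

Under this reading the criterion is met for every $\alpha \in (1,2]$. By contrast, a kernel with $\varphi(0) = 0$, such as $\varphi(t) = t e^{-\lambda t}$, fails the criterion regardless of the integrability of $\varphi'$.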

A result similar to Corollary 1.3 can be formulated when

is a symmetric tem-

pered stable Lévy process that is, when the Lévy measure takes the form

(

) =

η|x|

−α−1

−λ|x|

for

η,λ >

0 and

α ∈

2). Indeed, since

(

)

< ∞

and

has

unbounded support on both (

−∞,

0] and [0

,∞

) in this setup, there exists an ELMM

for (

)

t∈[0,T ]

if and only if

(0)

0 and

∞

(

|ϕ

(

)

∧|ϕ

(

)

dt < ∞

(as the latter

condition is equivalent to (1.3) cf. [4, Example 4.9]).

It may be stressed that the Gaussian case considered in Theorem 1.1 and the non-Gaussian case considered in Theorem 1.2 are of fundamentally different structure. Indeed, when $L$ is a Brownian motion, one can apply the martingale representation theorem (when $(\mathcal{F}_t)_{t\in[0,T]}$ is the smallest filtration that meets (1.2) and satisfies the usual conditions) to show that the ELMM is unique, and by invariance of the quadratic variation under equivalent change of measure, $(X_t - X_0)_{t\in[0,T]}$ is a Brownian motion under the ELMM (one may need a semimartingale decomposition of $(X_t)_{t\in[0,T]}$, see e.g. (4.21)). If $L$ is a general Lévy process, Theorem 2.1 and Remark 2.2 in Section 2 show that the ELMM will not be unique, and $(X_t - X_0)_{t\in[0,T]}$ and $(L_t)_{t\in[0,T]}$ will not be Lévy processes under any of our constructed ELMMs.

Besides the moving average framework we will also study ELMMs for semimartingales of the form

\[ X_t = L_t + \int_0^t Y_s\,\mathrm{d}s, \quad t \in [0,T], \tag{1.4} \]

for a given $(\mathcal{F}_t)_{t\in[0,T]}$-Lévy process $(L_t)_{t\in[0,T]}$ and a predictable process $(Y_t)_{t\in[0,T]}$ such that $t \mapsto Y_t$ is integrable on $[0,T]$ almost surely. This study turns out to be useful in order to deduce results for moving averages.

We now briefly outline the paper. Section 2 presents Theorem 2.1, which concerns precise and tractable conditions on $(L_t)_{t\in[0,T]}$ and $(Y_t)_{t\in[0,T]}$ ensuring the existence of an ELMM for $(X_t)_{t\in[0,T]}$ given by (1.4). An implication of this result is Theorem 1.2 and in turn Corollary 1.3. Theorem 2.1 is followed by a predictable criterion ensuring the martingale property of stochastic exponentials, Theorem 2.5, which is based on a general approach of Lépingle and Mémin [21]. Due to the nature of this criterion, it can be used for other purposes than verifying the existence of ELMMs for $(X_t)_{t\in[0,T]}$, and thus the result is of independent interest. Both Theorem 2.1 and Theorem 2.5 are accompanied by remarks and examples that illustrate their applications. Subsequently, Section 3 recalls the most fundamental and important concepts in relation to change of measure and integrals with respect to random measures. These concepts will be used throughout Section 4, which is devoted to proving the statements of Section 2. Section 4 also contains additional remarks and examples of a more technical nature.

2 Further main results

Let $L = (L_t)_{t\in[0,T]}$ be an $(\mathcal{F}_t)_{t\in[0,T]}$-Lévy process with triplet $(c, F, b_h)$ relative to some truncation function $h\colon \mathbb{R} \to \mathbb{R}$. (Recall that a truncation function is a measurable function $h\colon \mathbb{R} \to \mathbb{R}$ which is bounded and satisfies $h(x) = x$ for $x$ in a neighborhood of $0$.) Here $b_h \in \mathbb{R}$ is the drift component, $c \geq 0$ is the Gaussian component, and $F$ is the Lévy measure. Throughout the paper we will assume that $L_t$ is integrable for every $t \in [0,T]$ which, according to [26, Corollary 25.8], is equivalent to $\int_{|x|>1} |x|\,F(\mathrm{d}x) < \infty$. Then, we may set $\xi = \int_{\mathbb{R}} (x - h(x))\,F(\mathrm{d}x) + b_h$ so that $\mathbb{E}[L_t] = \xi t$. We denote by $\mu$ the jump measure of $L$ and by $\nu(\mathrm{d}t, \mathrm{d}x) = F(\mathrm{d}x)\,\mathrm{d}t$ its compensator. It will be assumed that $L$ has both positive and negative jumps such that we can choose $a, b > 0$ with

\[ F((-b,-a)),\ F((a,b)) > 0. \tag{2.1} \]

In Theorem 2.1 we will give conditions for the existence of an ELMM $\mathbb{Q}$ for $(X_t)_{t\in[0,T]}$ given by

\[ X_t = L_t + \int_0^t Y_s\,\mathrm{d}s, \quad t \in [0,T], \tag{2.2} \]

where $(Y_t)_{t\in[0,T]}$ is a predictable process and $t \mapsto Y_t$ is Lebesgue integrable on $[0,T]$ almost surely. We will also provide the semimartingale (differential) characteristics of $(X_t)_{t\in[0,T]}$ under $\mathbb{Q}$ (these are defined in [16, Ch. II] and can be found in Section 3 as well). Recall that the notation $A^c$ is used for the complement of a set $A \subseteq \mathbb{R}$.


Theorem 2.1. Let $(X_t)_{t\in[0,T]}$ be given by (2.2). Consider the hypotheses:

(h1) The collection $(Y_t)_{t\in[0,T]}$ is tight and $Y_t$ is infinitely divisible with a Lévy measure supported in $[-K,K]$ for all $t \in [0,T]$ and some $K > 0$.

(h2) The Lévy measure of $L$ has unbounded support on both $(-\infty,0]$ and $[0,\infty)$.

If either (h1) or (h2) holds, there exists an ELMM $\mathbb{Q}$ on $(\Omega, \mathcal{F})$ for $(X_t)_{t\in[0,T]}$ such that $\mathrm{d}\mathbb{Q} = \mathcal{E}((\alpha - 1) * (\mu - \nu))_T\,\mathrm{d}\mathbb{P}$ for some predictable function $\alpha\colon \Omega \times [0,T] \times \mathbb{R} \to (0,\infty)$ and the differential characteristics of $(X_t)_{t\in[0,T]}$ relative to $h$ under $\mathbb{Q}$ are of the form

\[ \Bigl( c,\ \alpha(t,x)\,F(\mathrm{d}x),\ b_h + Y_t + \int_{\mathbb{R}} (\alpha(t,x) - 1)\,h(x)\,F(\mathrm{d}x) \Bigr), \quad t \in [0,T]. \tag{2.3} \]

For any $a, b > 0$ that meet (2.1), depending on the hypothesis, $\mathbb{Q}$ can be chosen such that:

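(h1) The function $\alpha$ is explicitly given by

\[ \alpha(t,x) = 1 + \frac{(Y_t + \xi)^-}{\sigma_+}\,\mathbf{1}_{(a,b)}(x) - \frac{(Y_t + \xi)^+}{\sigma_-}\,\mathbf{1}_{(a,b)}(-x) \tag{2.4} \]

where $\sigma_\pm = \int_{\mathbb{R}} y\,\mathbf{1}_{(a,b)}(\pm y)\,F(\mathrm{d}y)$.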

(h2) With λ = F([−a,a]

), the relations

[−a,a]

α(t,x)F(dx) = λ and

[−a,a]

xα(t, x)F(dx) = −(Y

+ b

) (2.5)

hold pointwise, and α(t,x) = 1 whenever |x| ≤ a.
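The choice of $\alpha$ under (h1) can be sanity-checked numerically. The sketch below is an illustration under explicit assumptions, not part of the paper: it takes a hypothetical finitely supported Lévy measure, reads (2.4) as $\alpha(t,x) = 1 + \frac{(Y_t+\xi)^-}{\sigma_+}\mathbf{1}_{(a,b)}(x) - \frac{(Y_t+\xi)^+}{\sigma_-}\mathbf{1}_{(a,b)}(-x)$ with $\sigma_\pm = \int y\,\mathbf{1}_{(a,b)}(\pm y)\,F(\mathrm{d}y)$, and verifies that $\alpha$ is positive and that the drift $\xi + Y_t + \int x(\alpha(t,x) - 1)\,F(\mathrm{d}x)$ vanishes, which is what one expects of a density that turns $X$ into a local martingale. All names are illustrative.

```python
import numpy as np

def alpha_h1(y_t, xi, atoms, masses, a, b):
    """Evaluate the assumed form of alpha(t, x) at the atoms of a discrete
    Levy measure F = sum_i masses[i] * delta_{atoms[i]}."""
    pos = (atoms > a) & (atoms < b)      # indicator 1_{(a,b)}(x)
    neg = (-atoms > a) & (-atoms < b)    # indicator 1_{(a,b)}(-x)
    sigma_plus = np.sum(atoms[pos] * masses[pos])    # positive by (2.1)
    sigma_minus = np.sum(atoms[neg] * masses[neg])   # negative by (2.1)
    drift = y_t + xi
    neg_part = max(-drift, 0.0)
    pos_part = max(drift, 0.0)
    return 1.0 + neg_part / sigma_plus * pos - pos_part / sigma_minus * neg

def residual_drift(y_t, xi, atoms, masses, a, b):
    """xi + Y_t + int x (alpha - 1) F(dx); zero when the drift is removed."""
    alpha = alpha_h1(y_t, xi, atoms, masses, a, b)
    return y_t + xi + np.sum(atoms * (alpha - 1.0) * masses)

# Hypothetical Levy measure with mass on both (-b, -a) and (a, b).
atoms = np.array([-1.5, -0.2, 0.3, 1.2])
masses = np.array([0.5, 2.0, 1.0, 0.8])
```

For example, `residual_drift(0.4, 0.1, atoms, masses, 1.0, 2.0)` and `residual_drift(-0.4, 0.1, atoms, masses, 1.0, 2.0)` are both zero up to rounding, and only the jumps with size in $(a,b)$ or $(-b,-a)$ are reweighted.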

Remark 2.2.

Suppose that Theorem 2.1 is applicable. Observe that, for instance by

varying

a,b >

0, an ELMM for (

)

t∈[0,T ]

is not unique. In the following, ﬁx an ELMM

for (

)

t∈[0,T ]

, under which its characteristics have a diﬀerential form as in

(2.3)

relative to a truncation function

. As a ﬁrst comment we see that, as long as (

)

t∈[0,T ]

is not deterministic, the characteristic triplet under

of both (

)

t∈[0,T ]

and (

)

t∈[0,T ]

will not be deterministic. Consequently by [16, Theorem 4.15 (Ch. II)], none of them

have independent increments, in particular they will never be Lévy processes, under

. Despite the fact that (

)

t∈[0,T ]

does not have independent increments under

we may still extract some useful information from the diﬀerential characteristics.

Indeed, according to [16, Theorem 2.34 (Ch. II)], we may represent (

)

t∈[0,T ]

through

its canonical representation (under P) as

= X

+ h(x) ∗(µ −ν)

+ (x −h(x)) ∗µ

+ b

) ds, t ∈ [0,T ], (2.6)

where (

)

t∈[0,T ]

is the continuous martingale part of (

)

t∈[0,T ]

under

and

∗

denotes

integration, see Section 3 for more on the notation. Furthermore, recall that

∈ R

the drift component of (

)

t∈[0,T ]

relative to

and

is the jump measure associated

to (

)

t∈[0,T ]

(or equivalently, (

)

t∈[0,T ]

). Consider the speciﬁc truncation function

(

) =

(a,b)

(

|x|

) under (h1) and

(

) =

[−a,a]

(

) under (h2). From

(2.3)

and

(2.6)

deduce under Q:

(i) The process X

, t ∈ [0,T ], remains a Brownian motion with variance c.


(ii)

It still holds that

(

)

∗

(

µ −ν

)

t ∈

], is a zero-mean Lévy process and its

distribution is unchanged.

(iii) The process

(x −h(x)) ∗µ

+ b

) ds, t ∈ [0,T ], (2.7)

is a local martingale, since (X

)

t∈[0,T ]

is a local martingale.

(iv)

Except for the drift term involving (

)

t∈[0,T ]

, it follows that the only component

(2.6)

aﬀected by the change of measure (under any of the hypotheses) is

(

x −h

(

))

∗µ

t ∈

], which goes from a compound Poisson process under

to a general cádlág and piecewise constant process under

. Speciﬁcally,

it will be aﬀected in such a way that it is compensated according to

(2.7)

. By

exploiting the structure of the compensator of

under

it follows that the

jumps of (

x −h

(

))

∗µ

t ∈

], still arrive according to a Poisson process (with

the same intensity as under

) under (h2) while under (h1), they will arrive

according to a counting process with a stochastic intensity. The (conditional)

jump distribution is obtained from Lemma 4.5.

Note that although, strictly speaking, the function

(

) =

(a,b)

(

|x|

) is not a genuine

truncation function, we are allowed to use it as such, since

|x|>1

|x|F

(

)

< ∞

assumption, which means the integrals in (2.6) will be well-deﬁned.

Remark 2.3.

As a ﬁrst comment on the hypotheses presented in the statement of

Theorem 2.1 we see that none of them is superior to the other one. Rather, there

is a trade oﬀ between the restrictions on (

)

t∈[0,T ]

and on (

)

t∈[0,T ]

. In line with

Remark 4.3, one may as well replace (h1) by

(h1’) For any t ∈ [0,T ] and a suitable ε > 0, Y

= Y

and E[e

ε|Y

|log(1+|Y

] < ∞.

The advantage of this hypothesis is that one is not restricted to the case where

inﬁnitely divisible, however the price to pay is to require that

= Y

rather than

the much weaker assumption of (Y

)

t∈[0,T ]

being tight.

Remark 2.4.

Suppose that (

)

t≥0

and (

)

t≥0

are deﬁned on the probability space

(

Ω,F ,

(

)

t≥0

) with

t≥0

and that Theorem 2.1 is applicable on the trun-

cated space (

Ω,F

(

)

t∈[0,T ]

,P|

) for any

T >

0. Then one can sometimes extend it

to a locally equivalent measure

on (

Ω,F

). A probability space having this property

is often referred to as being full. An example is the space of all càdlàg functions

taking values in a Polish space when equipped with its standard ﬁltration. For more

details, see [5] and [10]. As is the case for Lévy processes, we believe that such

will usually not be equivalent to P, and we have chosen not to pursue this direction

further.

Despite the common structure in (2.3) under (h1) and (h2), the choices of $\alpha$ that we suggest under the different hypotheses in Theorem 2.1 differ by their very nature. This is a consequence of the different ways of constructing the ELMM.

The proof of the existence of an ELMM for $(X_t)_{t\in[0,T]}$ consists of two steps. One step is to identify an appropriate candidate probability density $Z$, that is, a positive random variable which, given that $\mathbb{E}[Z] = 1$, defines an ELMM $\mathbb{Q}$ on $(\Omega, \mathcal{F})$ for $(X_t)_{t\in[0,T]}$ through $\mathrm{d}\mathbb{Q} = Z\,\mathrm{d}\mathbb{P}$. The candidate will always take the form $Z = \mathcal{E}((\alpha - 1) * (\mu - \nu))_T$ for some positive predictable function $\alpha$. The remaining step is to check that $\mathbb{E}[Z] = 1$ or, equivalently, that $\mathcal{E}((\alpha - 1) * (\mu - \nu))$ is a martingale. Although there exist several sharp results on when local martingales are true martingales, there has been a need for a tractable condition which is suited for the specific setup in question, and this was the motivation for Theorem 2.5. Specifically, it will be used to show Theorem 2.1 under hypothesis (h1). As mentioned, the proof of Theorem 2.5 is based on a very general approach presented by Lépingle and Mémin [21].

Theorem 2.5. Let $W:\Omega\times[0,T]\times\mathbb R\to\mathbb R_+$ be a predictable function. Suppose that

$$W(\omega,t,x)\le |P_t(\omega)|\,g(x)\quad\text{for all }(\omega,t,x)\in\Omega\times[0,T]\times\mathbb R,\tag{2.8}$$

where the following hold:

(a) The process $(P_t)_{t\in[0,T]}$ is predictable and satisfies that

(i) for some fixed $K>0$ and any $t\in[0,T]$, $P_t$ is infinitely divisible with Lévy measure supported in $[-K,K]$, and

(ii) the collection of random variables $(P_t)_{t\in[0,T]}$ is tight.

(b) The function $g:\mathbb R\to\mathbb R_+$ satisfies $g + g\log(1+g)\in L^1(F)$.

Then $W*(\mu-\nu)$ is well-defined and $\mathcal E(W*(\mu-\nu))$ is a martingale.

The following example shows how this result compares to other classical references for measure changes, when specializing to the case where $\mu$ is the jump measure of a Poisson process.

Example 2.6.

Suppose that

is a (homogeneous) Poisson process with intensity

λ >

and consider a density

((

α −

∗

(

µ −ν

))

for some positive predictable process

(

)

t∈[0,T ]

whose paths are integrable on $[0,T]$ almost surely. Within the literature of

(marked) point processes, with this setup as a special case, one explicit and standard

criterion ensuring that

[

] = 1 is the existence of constants

0 and

γ >

such that

≤ K

+ K

+ λt) (2.9)

for all

t ∈

] almost surely, see [6, Theorem T11 (Ch. VIII)] or [14, Eq. (25)]. We

observe that the inequality in

(2.9)

implies that

(2.8)

holds with

g ≡

1 and

2 +

(

λt

t ∈

], where (

)

t∈[0,T ]

meets (i)–(ii) in Theorem 2.5, and thus

this criterion is implied by our result. Clearly, this also indicates that we cover other,

less restrictive, choices of (

)

t∈[0,T ]

. For instance, one could take

= 1 and replace

by any Lévy process with a compactly supported Lévy measure in

(2.9)

. Note that,

although we might have

−

0, Theorem 2.5 may still be applied according to

Remark 4.2. For other improvements of (2.9), see also [27].

Section 4 contains proofs of the statements above accompanied by a minor supporting result and a discussion of the techniques. However, we start by recalling some fundamental concepts which will be (and already have been) used repeatedly.

Paper A · Equivalent martingale measures for Lévy-driven moving averages and related

processes

3 Preliminaries

The following consists of a short recap of fundamental concepts. For a more formal

and extensive treatment, see [16].

The stochastic exponential $\mathcal E(M) = (\mathcal E(M)_t)_{t\in[0,T]}$ of a semimartingale $(M_t)_{t\in[0,T]}$ is characterized as the unique càdlàg and adapted process with

$$\mathcal E(M)_t = 1 + \int_0^t \mathcal E(M)_{s-}\,dM_s,\quad t\in[0,T].$$

It is explicitly given as

$$\mathcal E(M)_t = e^{M_t - M_0 - \frac12\langle M^c\rangle_t}\prod_{s\le t}(1+\Delta M_s)e^{-\Delta M_s},\quad t\in[0,T],\tag{3.1}$$

where $(M^c_t)_{t\in[0,T]}$ is the continuous martingale part of $(M_t)_{t\in[0,T]}$. If $(M_t)_{t\in[0,T]}$ is a local martingale, $\mathcal E(M)$ is a local martingale as well. Consequently, whenever $\mathcal E(M)_t\ge 0$, or equivalently $\Delta M_t\ge -1$, for all $t\in[0,T]$ almost surely, $\mathcal E(M)$ is a supermartingale. (Here, and in the following, we have adopted the definition of a semimartingale from [16], which in particular means that the process is càdlàg.)
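As an illustration of the explicit formula (3.1), consider the simplest pure-jump case, which is an assumption made here for the sake of the example and not a setting singled out by the text: $M_t = (\alpha-1)(N_t-\lambda t)$ for a Poisson process $N$ with intensity $\lambda$ and a constant $\alpha>0$. Then $M$ has no continuous martingale part, every jump equals $\alpha-1$, and (3.1) reduces to $\mathcal E(M)_t = e^{-(\alpha-1)\lambda t}\alpha^{N_t}$. The sketch below (all function names are hypothetical) checks numerically that this density has mean one.

```python
import math


def stochastic_exponential_poisson(alpha: float, lam: float, t: float, n_jumps: int) -> float:
    """E(M)_t for M_t = (alpha - 1) * (N_t - lam * t), given that N_t = n_jumps.

    By (3.1): exp(M_t - sum of jumps) = exp(-(alpha - 1) * lam * t) and the
    product over jumps of (1 + jump) equals alpha ** n_jumps.
    """
    return math.exp(-(alpha - 1.0) * lam * t) * alpha ** n_jumps


def poisson_pmf(n: int, mean: float) -> float:
    """P(N = n) for N ~ Poisson(mean), computed in log-space for stability."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))


def expected_density(alpha: float, lam: float, t: float, n_max: int = 200) -> float:
    """E[E(M)_t], summing over the (truncated) distribution of N_t."""
    return sum(
        poisson_pmf(n, lam * t) * stochastic_exponential_poisson(alpha, lam, t, n)
        for n in range(n_max + 1)
    )


def expected_jumps_under_q(alpha: float, lam: float, t: float, n_max: int = 200) -> float:
    """E_Q[N_t] where dQ = E(M)_t dP; Girsanov suggests the value alpha * lam * t."""
    return sum(
        n * poisson_pmf(n, lam * t) * stochastic_exponential_poisson(alpha, lam, t, n)
        for n in range(n_max + 1)
    )
```

In this toy case the mean-one property can of course be verified by hand; the point of the sketch is only to make the structure of (3.1) concrete.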

A random measure on $[0,T]\times\mathbb R$ is a family of measures $\mu$ such that for each $\omega\in\Omega$, $\mu(\omega;dt,dx)$ is a measure on $([0,T]\times\mathbb R,\mathcal B([0,T])\otimes\mathcal B(\mathbb R))$ satisfying $\mu(\omega;\{0\}\times\mathbb R) = 0$. For our purpose, $\mu$ will also satisfy that $\mu(\omega;[0,T]\times\{0\}) = 0$. Integration of a function $W:\Omega\times[0,T]\times\mathbb R\to\mathbb R$ with respect to $\mu$ over the set $(0,t]\times\mathbb R$ is denoted $W*\mu_t$ for $t\in[0,T]$. In this paper, $\mu$ will always be the jump measure of some adapted càdlàg process. To any such $\mu$, one can associate a unique (up to a null set) predictable random measure $\nu$, which is called its compensator. We will always be in the case where $\nu(\omega;dt,dx) = F_t(\omega;dx)\,dt$ with $(F_t(B))_{t\in[0,T]}$ being a predictable process for every $B\in\mathcal B(\mathbb R)$. One can define the stochastic integral with respect to the compensated random measure $\mu-\nu$ for any predictable function $W:\Omega\times[0,T]\times\mathbb R\to\mathbb R$ satisfying that $(W^2*\mu_t)^{1/2}$, $t\in[0,T]$, is locally integrable. The associated integral process is denoted $W*(\mu-\nu)$.

Let $h:\mathbb R\to\mathbb R$ be a bounded function with $h(x) = x$ in a neighbourhood of 0. The characteristics of a semimartingale $(M_t)_{t\in[0,T]}$, relative to the truncation function $h$, are then denoted $(C,\nu,B)$, which is unique up to a null set. Here $C$ is the quadratic variation of the continuous martingale part of $(M_t)_{t\in[0,T]}$, $\nu$ is the predictable compensator of its jump measure, and $B$ is the predictable finite variation part of the special semimartingale given by $M_t - \sum_{s\le t}[\Delta M_s - h(\Delta M_s)]$ for $t\in[0,T]$. In the case where

$$C_t = \int_0^t c_s\,ds,\quad \nu(\omega;dt,dx) = F_t(\omega;dx)\,dt\quad\text{and}\quad B_t = \int_0^t b_s\,ds$$

for some predictable processes $(b_t)_{t\in[0,T]}$ and $(c_t)_{t\in[0,T]}$ and transition kernel $F_t(\omega;dx)$, we call $(c_t,F_t,b_t)$ the differential characteristics of $(M_t)_{t\in[0,T]}$.

Suppose that we have another probability measure $Q$ on $(\Omega,\mathcal F)$ such that $dQ = \mathcal E(W*(\mu-\nu))_T\,dP$, where $\mu$ is the jump measure of an $(\mathcal F_t)_{t\in[0,T]}$-Lévy process $(L_t)_{t\in[0,T]}$ with characteristic triplet $(c,F,b)$ relative to a given truncation function $h$ and $\nu$ is the compensator of $\mu$. Then a version of Girsanov’s theorem, see [2] or [18], implies that under $Q$, $(L_t)_{t\in[0,T]}$ is a semimartingale with differential characteristics $(c,F_t,b_t)$, where

$$F_t(dx) = (1+W(t,x))F(dx)\quad\text{and}\quad b_t = b + \int_{\mathbb R} W(t,x)h(x)F(dx).\tag{3.2}$$
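The transformation in (3.2) is easy to carry out when the Lévy measure is a finite sum of point masses and $W$ does not depend on time. The following sketch makes exactly these two simplifying assumptions (they are not imposed in the text), and the function name `girsanov_characteristics` is hypothetical.

```python
from typing import Callable, Dict, Tuple


def girsanov_characteristics(
    levy_masses: Dict[float, float],
    b: float,
    W: Callable[[float], float],
    h: Callable[[float], float],
) -> Tuple[Dict[float, float], float]:
    """Apply (3.2) to a discrete Levy measure F = sum of masses at points x.

    levy_masses maps a jump size x to F({x}); W must satisfy W(x) > -1.
    Returns the new Levy measure (1 + W) F and the new drift b + int W h dF.
    """
    new_masses = {x: (1.0 + W(x)) * mass for x, mass in levy_masses.items()}
    new_b = b + sum(W(x) * h(x) * mass for x, mass in levy_masses.items())
    return new_masses, new_b


# Illustrative input: two jump sizes, truncation h(x) = x on [-1, 1].
F = {-1.0: 0.5, 2.0: 0.25}
truncation = lambda x: x if abs(x) <= 1.0 else 0.0
tilt = lambda x: 1.0 if x < 0 else -0.5  # doubles negative jumps, halves positive ones

new_F, new_drift = girsanov_characteristics(F, 0.1, tilt, truncation)
```

Only jumps inside the support of the truncation function contribute to the drift correction, which is why the mass at 2 changes the Lévy measure but not the drift in this example.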

4 Proofs

In the following, let f : (−1, ∞) →R

be deﬁned by

f (x) = (1+ x)log(1 + x) −x, x > −1. (4.1)
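Three elementary properties of the function in (4.1) are used repeatedly in the proofs below: $f\ge 0$, $f$ is increasing on $[0,\infty)$, and $f(x)\le x\log(1+x)$ for $x\ge 0$. The following sketch, a numerical sanity check on a grid rather than a proof, makes them concrete.

```python
import math


def f(x: float) -> float:
    """The function f(x) = (1 + x) * log(1 + x) - x from (4.1), defined for x > -1."""
    if x <= -1.0:
        raise ValueError("f is only defined for x > -1")
    return (1.0 + x) * math.log1p(x) - x


# Grids on which the properties are checked numerically.
grid_nonneg = [k / 10.0 for k in range(0, 501)]      # 0.0, 0.1, ..., 50.0
grid_neg = [-0.9 + k / 10.0 for k in range(0, 9)]     # -0.9, ..., -0.1

is_nonnegative = all(f(x) >= -1e-12 for x in grid_neg + grid_nonneg)
is_increasing = all(f(a) <= f(b) for a, b in zip(grid_nonneg, grid_nonneg[1:]))
is_dominated = all(f(x) <= x * math.log1p(x) + 1e-12 for x in grid_nonneg)
```

Analytically, the first two properties follow from $f(0)=0$ and $f'(x)=\log(1+x)$, and the third is equivalent to $\log(1+x)\le x$.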

In order to show Theorem 2.5 we will state and prove a local version of [21, Theo-

rem 1 (Section III)] below.

Lemma 4.1. Let $(M_t)_{t\in[0,T]}$ be a purely discontinuous local martingale with $\Delta M_t > -1$ for all $t\in[0,T]$ almost surely. Suppose that the process

$$\sum_{s\le t} f(\Delta M_s),\quad t\in[0,T],$$

has compensator $(A_t)_{t\in[0,T]}$ and that there exist stopping times $0 = \tau_0 < \tau_1 < \cdots < \tau_n = T$ such that

$$E\big[\exp(A_{\tau_k} - A_{\tau_{k-1}})\big] < \infty\quad\text{for all } k = 1,\dots,n.\tag{4.2}$$

Then $\mathcal E(M)$ is a martingale.

Proof. The technique used to prove the result is similar to the one used in the proof of [24, Lemma 13]. For a given $k\in\{1,\dots,n\}$ define the process

$$M^{(k)}_t = M_{t\wedge\tau_k} - M_{t\wedge\tau_{k-1}},\quad t\in[0,T].$$

Note that $(M^{(k)}_t)_{t\in[0,T]}$ is a (purely discontinuous) local martingale and consequently, $\mathcal E(M^{(k)})$ is a local martingale. Due to the jump structure

$$\Delta M^{(k)}_t = \begin{cases}\Delta M_t & \text{if } t\in(\tau_{k-1},\tau_k],\\ 0 & \text{otherwise},\end{cases}\tag{4.3}$$

it holds that

$$\sum_{s\le t} f\big(\Delta M^{(k)}_s\big) = \sum_{s\le t\wedge\tau_k} f(\Delta M_s) - \sum_{s\le t\wedge\tau_{k-1}} f(\Delta M_s),\quad t\in[0,T].\tag{4.4}$$

Consequently, the compensator of (4.4) is $(A_{t\wedge\tau_k} - A_{t\wedge\tau_{k-1}})_{t\in[0,T]}$, and due to the assumption in (4.2) it follows by [24, Theorem 8] that $\mathcal E(M^{(k)})$ is a martingale.

By [19, p. 404] we know that for $k\in\{1,\dots,n-1\}$,

$$\mathcal E\big(M^{(k)}\big)\,\mathcal E\big(M^{(k+1)}\big) = \mathcal E\big(M^{(k)} + M^{(k+1)} + [M^{(k)},M^{(k+1)}]\big).$$

Using (4.3) and that $(M^{(k)}_t)_{t\in[0,T]}$ is purely discontinuous, one finds that $[M^{(k)},M^{(k+1)}] = 0$, so for any $t\in[0,\tau_k]$,

$$\mathcal E(M)_t = \mathcal E\Big(\sum_{l=1}^k M^{(l)}\Big)_t = \prod_{l=1}^k \mathcal E\big(M^{(l)}\big)_t.$$


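Since $\mathcal E(M^{(l)})_t = \mathcal E(M^{(l)})_{\tau_{k-1}}$ for all $t\ge\tau_{k-1}$ and $l<k$,

$$E\big[\mathcal E(M)_{\tau_k}\big] = E\Big[E\big[\mathcal E(M^{(k)})_{\tau_k}\mid\mathcal F_{\tau_{k-1}}\big]\prod_{l=1}^{k-1}\mathcal E\big(M^{(l)}\big)_{\tau_{k-1}}\Big] = E\big[\mathcal E(M)_{\tau_{k-1}}\big].$$

As a consequence, we get inductively that $E[\mathcal E(M)_T] = E[\mathcal E(M)_0] = 1$. By using the fact that $\mathcal E(M)$ is a supermartingale, we have the result. □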

Proof of Theorem 2.5. We divide the proof into two steps; the first step is to show that assumptions (i)–(ii) on $(P_t)_{t\in[0,T]}$ imply that for any $\varepsilon\in(0,1/K)$,

$$\sup_{t\in[0,T]} E\big[e^{\varepsilon|P_t|\log(1+|P_t|)}\big] < \infty,\tag{4.5}$$

and the second step will use this fact to prove that $W*(\mu-\nu)_t$, $t\in[0,T]$, is well-defined and that $\mathcal E(W*(\mu-\nu))$ is a martingale.
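To get a feeling for why the range $\varepsilon\in(0,1/K)$ appears in (4.5), one can look at a single illustrative random variable, chosen here as an assumption for the example: $P = KN$ with $N$ Poisson distributed, which is infinitely divisible with Lévy measure concentrated at $K$. The sketch below (hypothetical function names) works with the logarithm of the terms of the series $E[e^{\varepsilon P\log(1+P)}] = \sum_n e^{\varepsilon Kn\log(1+Kn)}P(N=n)$ to avoid overflow.

```python
import math


def log_term(n: int, eps: float, lam: float = 1.0, K: float = 1.0) -> float:
    """Logarithm of exp(eps * x * log(1 + x)) * P(N = n) with x = K * n, N ~ Poisson(lam)."""
    x = K * n
    log_pmf = -lam + n * math.log(lam) - math.lgamma(n + 1)
    return eps * x * math.log1p(x) + log_pmf


def partial_sum(eps: float, n_max: int, lam: float = 1.0, K: float = 1.0) -> float:
    """Partial sum of the series; only call this when the terms are known to be small."""
    return sum(math.exp(log_term(n, eps, lam, K)) for n in range(n_max + 1))


# For K = 1 and eps = 0.5 < 1/K the partial sums stabilise quickly.
stable_value = partial_sum(0.5, 200)
later_value = partial_sum(0.5, 400)

# For eps = 1.5 > 1/K the individual terms already grow without bound.
diverging_log_term = log_term(200, 1.5)
```

Since $\log n!\approx n\log n - n$, the $n$-th log-term behaves like $(\varepsilon K-1)n\log n$ to leading order, which is consistent with the threshold $1/K$ in this particular example.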

Step 1: The idea is to use a procedure similar to the one in [26, Lemma 26.5] and exploit the tightness property of $(P_t)_{t\in[0,T]}$ to get a uniform result across $t$. In the following we write

$$\Psi_t(u) := \log E\big[e^{uP_t}\big] = \tfrac12 c_t u^2 + \int_{\mathbb R}\big(e^{ux} - 1 - ux\mathbf 1_{[-1,1]}(x)\big)F_t(dx) + b_t u,\quad u\in\mathbb R,$$

for the Laplace exponent of $P_t$ with associated triplet $(c_t,F_t,b_t)$, $t\in[0,T]$, relative to the truncation function $h(x) = x\mathbf 1_{[-1,1]}(x)$. By the compact support of $F_t$, it follows from [26, Theorem 25.17] that $\Psi_t(u)\in\mathbb R$ is well-defined for all $u\in\mathbb R$ and $t\in[0,T]$.

For ﬁxed t, it holds that Ψ

∈ C

∞

(u) = c

u +



−x1

[−1,1]

(x)



(dx) + b

, u ∈ R, (4.6)

and

0, see [26, Lemma 26.4]. From

(4.6)

and the inequality

−

| ≤ e

|x|

for

x ∈ [−K,K] and u ≥ 0, we get the bound

(u) ≤c

u + e

(dx) + b

+ KF

((1,K]). (4.7)

Now suppose that

sup

t∈[0,T ]

(

) =

∞

. Then, by the tightness of (

)

t∈[0,T ]

we may according to Prokhorov’s theorem choose a sequence (

)

n≥1

⊆

] and a

random variable P such that

−−−→P and lim

n→∞

(dx) = ∞. (4.8)

Since

is inﬁnitely divisible, it has an associated characteristic triplet (

c, ρ,b

). By [16,

Theorem 2.9 (Ch. VII)] it holds that

lim

n→∞

g dF

g dρ

for all

g : R → R

which are continuous, bounded, and vanishing in a neighbourhood

of 0. In particular, by the uniformly compact support of (

)

n≥1

, we get that


compactly supported. As a consequence, [16, Theorem 2.14 (Ch. VII)] and

(4.8)

imply

that

c +

ρ(dx) = lim

n→∞



(dx)



= ∞,

a contradiction, and we conclude that

sup

t∈[0,T ]

(

)

< ∞

. The same reasoning

gives that both

sup

t∈[0,T ]

and

sup

t∈[0,T ]

(

((1

])) are ﬁnite as well. From these

observations and (4.7) we deduce the existence of a constant C > 0 such that

(u) ≤C



1 + u + e



(4.9)

for u ≥ 0. We may without loss of generality assume that

lim

u→±∞

(u) = ∞ (4.10)

for all

t ∈

]. To see this, let

and

−

be standard Poisson random variables

which are independent of each other and of (P

)

t∈[0,T ]

, and consider the process

= P

+ K(N

−N

−

), t ∈ [0,T ].

This process still satisﬁes assumptions (i)–(ii) stated in Theorem 2.5, and the deriva-

tive of the associated Laplace exponents will necessarily satisfy

(4.10)

by the structure

(4.6)

, since

has a Lévy measure with mass on both (

−∞,

0] and [0

,∞

). Moreover,

the inequality

P(N

= 0)

−2

sup

t∈[0,T ]

ε|

|log(1+|

≥ sup

t∈[0,T ]

ε|P

|log(1+|P

implies that it suﬃces to show

(4.5)

for (

)

t∈[0,T ]

. Thus, we will continue under the

assumption that

(4.10)

holds. Now, by [26, Lemma 26.4] we may ﬁnd a constant

> 0 such that for any t, the inverse of Ψ

, denoted by θ

, exists on (ξ

,∞) and

P(P

≥ x) ≤ exp

−

(ξ) dξ

for any x > ξ

. (4.11)

Since

lim

ξ→∞

(

) =

∞

and

K −

/ε

0 for

∈

(

ε,

), it follows by

(4.9)

that

lim

ξ→∞

ξe

−θ

(ξ)/ε

= 0. In particular, by

(4.9)

once again, we can choose a

≥ ξ

(independent of

) such that

−θ

(

)

≤ −ε

logξ

for every

ξ ≥ ξ

. Combining this fact

with (4.11) gives that

P(P

≥ x) ≤ exp

−ε

logξ dξ

≤

−ε

x(logx−1)

for x > ξ

and t ∈ [0,T ],

where

is some constant independent of

. By estimating the probability

(

≤

−x

) =

(

−P

≥ x

) in a similar way it follows that

and

can be chosen large enough

to ensure that

G(x) B sup

t∈[0,T ]

P(|P

| ≥ x) ≤

−ε

x logx

for all t ∈ [0,T ] and x ≥ ξ. (4.12)

If we set $G_t(x) = P(|P_t|\ge x)$ for $x\ge 0$, we have

$$E\big[e^{\varepsilon|P_t|\log(1+|P_t|)}\big] = -\int_0^\infty e^{\varepsilon x\log(1+x)}\,G_t(dx) = 1 + \varepsilon\int_0^\infty e^{\varepsilon x\log(1+x)}\Big(\log(1+x) + \frac{x}{1+x}\Big)G_t(x)\,dx$$

using integration by parts, and this implies in turn that

$$\sup_{t\in[0,T]} E\big[e^{\varepsilon|P_t|\log(1+|P_t|)}\big] \le 1 + \varepsilon\int_0^\infty e^{\varepsilon x\log(1+x)}\big(\log(1+x)+1\big)G(x)\,dx < \infty$$

by (4.12). Consequently, we have shown that (4.5) does indeed hold.
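The integration by parts identity used above is an instance of the tail formula $E[\varphi(X)] = \varphi(0) + \int_0^\infty \varphi'(x)P(X\ge x)\,dx$ for a nonnegative random variable $X$ and increasing $\varphi$. For a discrete $X$ the integral can be evaluated exactly, because the tail is piecewise constant. The sketch below checks the identity for $\varphi(x) = e^{\varepsilon x\log(1+x)}$; the distribution and all names are illustrative assumptions.

```python
import math
from typing import Sequence


def phi(x: float, eps: float) -> float:
    """phi(x) = exp(eps * x * log(1 + x)), the integrand appearing in (4.5)."""
    return math.exp(eps * x * math.log1p(x))


def expectation_direct(values: Sequence[float], probs: Sequence[float], eps: float) -> float:
    """E[phi(X)] computed directly from the distribution of X."""
    return sum(p * phi(x, eps) for x, p in zip(values, probs))


def expectation_via_tail(values: Sequence[float], probs: Sequence[float], eps: float) -> float:
    """phi(0) + integral of phi'(x) * P(X >= x) over (0, inf), exact for discrete X >= 0.

    Between consecutive atoms the tail P(X >= x) is constant, so the integral
    over that interval is the tail value times the increment of phi.
    """
    total = phi(0.0, eps)
    tail = sum(probs)          # P(X >= x) for x in (0, smallest positive atom]
    previous = 0.0
    for x, p in sorted(zip(values, probs)):
        total += tail * (phi(x, eps) - phi(previous, eps))
        tail -= p              # mass at x no longer counts beyond x
        previous = x
    return total


atoms = [0.0, 0.5, 1.0, 3.0]
weights = [0.1, 0.4, 0.3, 0.2]
direct = expectation_direct(atoms, weights, 0.3)
via_tail = expectation_via_tail(atoms, weights, 0.3)
```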

Step 2: Arguing that $W*(\mu-\nu)_t$, $t\in[0,T]$, is well-defined amounts to showing that $E[|W|*\nu_T] < \infty$. This is clearly the case since by (2.8),

$$E[W*\nu_T] \le T\sup_{t\in[0,T]} E[|P_t|]\int_{\mathbb R} g(x)F(dx),$$

and the right-hand side is finite by (4.5). By definition we have the equality

$$f(W)*\mu_t = \sum_{s\le t} f\big(\Delta(W*(\mu-\nu))_s\big),\quad t\in[0,T],$$

and the compensator of the process exists and is given as $A = f(W)*\nu$, since

$$E[f(W)*\nu_T] \le T\Big(\Big(1 + \sup_{t\in[0,T]} E[|P_t|]\Big)\int_{\mathbb R} g(x)\log(1+g(x))F(dx) + \sup_{t\in[0,T]} E\big[|P_t|\log(1+|P_t|)\big]\int_{\mathbb R} g(x)F(dx)\Big),$$

which is finite by assumption (b) and (4.5). In the following we will argue that (4.2) in Lemma 4.1 is satisfied for $\tau_k\equiv t_k$, $k = 0,\dots,n$, for suitable numbers $0 = t_0 < t_1 < \cdots < t_n = T$, which subsequently allows us to conclude that $\mathcal E(W*(\mu-\nu))$ is a martingale.

Fix 0 ≤ s < t ≤ T and note that (2.8) implies

−

≤

f (|P

|g(x))F(dx) du, (4.13)

since

is increasing on

. We want to obtain a bound on

(

)

(

))

(

)

for y ≥ 0. First note that f (x) ≤ x log(1+ x) whenever x ≥ 0, so

h(y)

y logy

≤

g(x)



1 +

log(1 + g(x))

logy



F(dx) y > 0.

Consequently, due to assumption (b), we can apply Lebesgue’s theorem on dominated

convergence to deduce that

limsup

y→∞

h(y)

y log(1 + y)

< γ

for some γ

∈ (0,∞).

By monotonicity of

we may ﬁnd

∈

,∞

) so that we obtain the bound

(

)

≤

y log

(1 +

) +

for all

y ≥

0. Thus for all 0

≤ s < t ≤ T

, we have established the

estimate

h(|P

|) du ≤ γ

|log(1 + |P

|) du + γ

(t −s). (4.14)


Now choose a partition 0 =

< t

< ··· < t

with

−t

k−1

≤ ε/γ

for some small

number

satisfying

(4.5)

holds. By

(4.13)

and

(4.14)

it follows by an application of

Jensen’s inequality and Tonelli’s theorem that

−γ

−t

k−1

)

−

k−1

≤ E

exp

k−1

|log(1 + |P

|) dt

≤

−t

k−1

ε|P

|log(1+|P

≤ sup

t∈[0,T ]

ε|P

|log(1+|P

which is ﬁnite and, thus, the proof is completed. 

Remark 4.2.

It appears from

(4.13)

above that if

(

)

< ∞

(

u, x

) = 0 for

x ∈

(

−δ, δ

) with

δ >

0, one may allow that

takes values in (

−

,∞

) by assuming that

|W (t,x)| ≤ |P

|g(x) and replacing the inequality with

−

≤ M(t −s) +

f (|P

|g(x))F(dx) du

for a suitable

M >

0. From this point, one can complete the proof in the same way as

above and get that E (W ∗(µ −ν)) is a martingale.

Remark 4.3.

Note that there are other sets of assumptions that can be used to show

Theorem 2.5, but they will not be superior to those suggested. Furthermore, the

assumptions that we suggest are natural in order to formulate Theorem 2.1 in a way

which in turn is suited for proving Theorem 1.2 in the introduction. However, by

adjusting the set of assumptions in Theorem 2.5, one may obtain similar adjusted

versions of Theorem 2.1 (see the discussion in Remark 2.3). In the bullet points below we briefly point out which properties the assumptions should imply and suggest other choices as well.

•

The importance of (i)–(ii) is that they ensure

(4.5)

holds. Thus, it follows that

one may replace these by

= P

for

t ∈

] and

[

ε|P

|log(1+|P

]

< ∞

for

some ε > 0.

• Instead of assuming that (P

)

t∈[0,T ]

is a process satisfying (4.5) and g + g log(1 +

g) ∈ L

(F), one may do a similar proof under the assumptions that

sup

t∈[0,T ]

ε|P

< ∞

and

g ∈L

(

) for some

ε >

0 and

γ ∈

2]. In particular, one may allow for less

integrability of F around zero for the cost of more integrability of (P

)

t∈[0,T ]

Example 4.4 below shows that one cannot relax assumption (i) in Theorem 2.5

and still apply (a localized version of) the approach of Lépingle and Mémin [21].

Moreover, it appears that this approach cannot naturally be improved in the sense of

obtaining a weaker condition than (4.5) in order to relax assumption (i).


Example 4.4.

Consider the case where

(

t,x

) =

|Y x|

for some

-measurable in-

ﬁnitely divisible random variable

with an associated Lévy measure which has

unbounded support. Moreover, suppose that

[

]

< ∞

and that

is given such

that Theorem 2.5(b) holds with

(

) =

|x|

. Then

W ∗

(

µ −ν

) is well-deﬁned, and the

compensator of

(

)

∗µ

exists and is given by

(

)

∗ν

(in the notation of

(4.1)

Following the same arguments as in the proof of Theorem 2.5 we obtain that

f (W ) ∗ν

≥ c

tY log(1 + Y ) −c

t, t ∈ [0,T ],

for suitable c

> 0. Consequently,

f (W )∗ν

≥ E

tY log(1+Y )

−c

= ∞

for any

t >

0 by [26, Theorem 26.1]. Thus, Lemma 4.1 cannot be applied if we remove

assumption (i) in Theorem 2.5. Naturally, one can ask whether it would be sufficient that $E[e^{\tilde f(W)*\nu_T}] < \infty$ for another measurable function $\tilde f:(-1,\infty)\to\mathbb R$. The idea in the proof of [21, Theorem 1 (Section III)] is built on the assumption that $\tilde f$ is a function with $(1-\lambda)\tilde f(x)\ge 1 + \lambda x - (1+x)^\lambda$ for all $x>-1$ and $\lambda\in(0,1)$. In particular, this requires that

$$\tilde f(x)\ge\lim_{\lambda\uparrow 1}\frac{1+\lambda x-(1+x)^\lambda}{1-\lambda} = f(x)$$

for all $x>-1$, and thus any other candidate function will be (uniformly) worse than $f$.
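The value of the limit in the last display can be checked by l'Hôpital's rule; the short computation below is a routine verification added for convenience and is not taken from [21]. For fixed $x>-1$, both $1+\lambda x-(1+x)^\lambda$ and $1-\lambda$ vanish at $\lambda=1$, and differentiating numerator and denominator with respect to $\lambda$ gives

$$\lim_{\lambda\uparrow 1}\frac{1+\lambda x-(1+x)^\lambda}{1-\lambda} = \lim_{\lambda\uparrow 1}\frac{x-(1+x)^\lambda\log(1+x)}{-1} = (1+x)\log(1+x)-x,$$

which is exactly the function defined in (4.1).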

Before proving Theorem 2.1 we will need a small result, which is stated and

proven in Lemma 4.5 below. While the result may be well-known, we have not been

able to find an appropriate reference. To a given adapted process $(M_t)_{t\in[0,T]}$ such that $t\mapsto M_t(\omega)$ is a càdlàg step function for each $\omega\in\Omega$, we define for $n\ge 1$ its $n$-th jump time and size by

$$T_n = \inf\{t\in(T_{n-1},T) : \Delta M_t\ne 0\}\in(0,T]\quad\text{and}\quad Z_n = \Delta M_{T_n},\tag{4.15}$$

respectively. Here we set $T_0\equiv 0$ and $\inf\emptyset = T$.

Lemma 4.5. Assume that the jump measure $J$ of some càdlàg adapted process $(M_t)_{t\in[0,T]}$ has a predictable compensator $\rho$ of the form $\rho(dt,dx) = G_t(dx)\,dt$, where $(G_t(B))_{t\in[0,T]}$ is a predictable process for every $B\in\mathcal B(\mathbb R)$ and $\lambda_t := G_t(\mathbb R)\in(0,\infty)$ for $t\in[0,T]$. Then, in the notation of (4.15), it holds that

$$P(Z_n\in B\mid\mathcal F_{T_n-}) = \Phi_{T_n}(B)\quad\text{on }\{T_n<T\}\tag{4.16}$$

for any $n\ge 1$ and $B\in\mathcal B(\mathbb R)$, where $\Phi_t := G_t/\lambda_t$.

Proof.

To show

(4.16)

, ﬁx

n ≥

1 and

B ∈ B

(

). Note that

−

is generated by sets of

the form A ∩{t < T

} for t ∈ [0,T ) and A ∈ F

. Consequently, it suﬃces to argue that

E[1

A∩{t<T

<T }

)] = E[1

A∩{t<T

<T }

(B)]. (4.17)

Deﬁne the functions φ,ψ : Ω ×[0,T ] ×R → R by

φ(s,x) = 1

A∩{t<T

}

n−1

≤t}

(t,T

]×B

(s, x) + 1

n−1

>t}

n−1

]×B

(s, x)]1

(0,T )

(s)


and

ψ(s,x) = 1

A∩{t<T

}

(B)[1

n−1

≤t}

(t,T

]

(s) + 1

n−1

>t}

n−1

]

(s)]1

(0,T )

(s),

and note that they are both predictable. Furthermore, we observe that the functions

are deﬁned such that

φ ∗J

= 1

A∩{t<T

<T }

), ψ ∗J

= 1

A∩{t<T

<T }

(B),

and

φ ∗ρ

= 1

A∩{t<T

<T }

(B)[1

n−1

≤t}

(t,T

]

(s) + 1

n−1

>t}

n−1

]

(s)] ds = ψ ∗ρ

Using these properties together with the dual relations $E[\phi*J_T] = E[\phi*\rho_T]$ and $E[\psi*J_T] = E[\psi*\rho_T]$ we obtain (4.17), and this gives the result. □

Using Lemma 4.5 it follows by a monotone class argument that on $\{T_n<T\}$,

$$E[g(Z_n)\mid\mathcal F_{T_n-}] = \int_{\mathbb R} g(x)\,\Phi_{T_n}(dx)\tag{4.18}$$

for any function $g:\Omega\times\mathbb R\to\mathbb R$ which is $\mathcal F_{T_n-}\otimes\mathcal B(\mathbb R)$-measurable. With this fact and Theorem 2.5 in hand, we are ready to prove Theorem 2.1.

Proof of Theorem 2.1.

We prove the result depending on the diﬀerent hypotheses.

In both cases the proof goes by arguing that

((

α −

∗

(

µ−ν

)) is a martingale and that

the probability measure

deﬁned by

((

α −

∗

(

µ −ν

))

is an ELMM for

(

)

t∈[0,T ]

. Since the diﬀerential characteristics of (

)

t∈[0,T ]

under

coincide with its

characteristic triplet (

c, F,b

), it follows directly from

(3.2)

that if

is a probability

measure, the diﬀerential characteristics of (

)

t∈[0,T ]

under

are given as in

(2.3)

. In

the following we have ﬁxed a,b > 0 such that (2.1) holds.

Case (h1): Consider the speciﬁc predictable function

given by

(2.4)

. Then

(

t,x

)

≥

1 and in particular,

((

α −

∗

(

µ −ν

))

0. Moreover,

(

t,x

)

−

≤ |P

(

)

with

and

(

) =

(a,b)

(

|x|

)

|x|

for some constant

C >

0. Since

is just a

constant, (

)

t∈[0,T ]

inherits the properties in (h1) of (

)

t∈[0,T ]

, thus (a) in Theorem 2.5

is satisﬁed. Likewise,

$$\int_{\mathbb R} g[1+\log(1+g)]\,dF = \int_{|x|\in(a,b)} C|x|\,[1+\log(1+C|x|)]\,F(dx) < \infty,$$

which shows that (b) is satisﬁed as well, and we conclude by Theorem 2.5 that

((

α −

∗

(

µ −ν

)) is a martingale. To argue that (

)

t∈[0,T ]

is a local martingale under

the associated probability measure Q, note that it suﬃces to show that

|x|∈(a,b)

xα(t, x)F(dx) = −(Y

+ b

)

(2.3)

and

(2.7)

in Remark 2.2, where

∈ R

is the drift component in the charac-

teristic triplet of

with respect to the (pseudo) truncation function

(

) =

(a,b)

(

|x|


Thus, we compute

|x|∈(a,b)

xα(t, x)F(dx)

|x|∈(a,b)

x F(dx) +

+ ξ)

−

(a,b)

F(dx) −

+ ξ)

−

(−b,−a)

F(dx)

|x|∈(a,b)

x F(dx) −(Y

+ ξ)

|x|∈(a,b)

x F(dx) −Y

−

|x|∈(a,b)

x F(dx) −b

= −(Y

+ b

and the result is shown under hypothesis (h1).

Case (h2): Set

(

· ∩

[

−a,a

]

). Note that

((

−∞,ζ

))

((

ζ, ∞

))

0 for any

ζ ∈ R

by assumption, and this implies that we may ﬁnd a strictly positive density

: R → (0,∞) such that

[−a,a]

(x)

(R)

F(dx) = 1 and

[−a,a]

(x)

(R)

F(dx) = ζ. (4.19)

To see this, assume that

is a random variable on (

Ω,F ,P

) with

= F

(

Then, since E[X | X < ζ] < ζ < E[X | X ≥ζ], we may deﬁne

$$\lambda(\zeta) = \frac{\zeta - E[X\mid X<\zeta]}{E[X\mid X\ge\zeta] - E[X\mid X<\zeta]}\in(0,1)\tag{4.20}$$

and

(B) = (1 −λ(ζ))P(X ∈ B | X < ζ) + λ(ζ)P(X ∈ B | X ≥ ζ), B ∈ B (R).
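The construction above reweights the law of $X$ separately below and above the level $\zeta$ so that the new law has mean $\zeta$. The following sketch carries this out for a discrete $X$ with finitely many atoms, which is a simplifying assumption made here (the text works with a general law having mass on both sides of $\zeta$); the function name `tilted_law` is hypothetical.

```python
from typing import List, Sequence, Tuple


def tilted_law(values: Sequence[float], probs: Sequence[float], zeta: float) -> Tuple[float, List[float]]:
    """Return lambda(zeta) from (4.20) and the reweighted probabilities.

    The new law is (1 - lambda) * P(X in . | X < zeta) + lambda * P(X in . | X >= zeta).
    Requires E[X | X < zeta] < zeta < E[X | X >= zeta], as in the text.
    """
    p_below = sum(p for x, p in zip(values, probs) if x < zeta)
    p_above = sum(p for x, p in zip(values, probs) if x >= zeta)
    if p_below <= 0.0 or p_above <= 0.0:
        raise ValueError("X must have mass on both sides of zeta")
    mean_below = sum(x * p for x, p in zip(values, probs) if x < zeta) / p_below
    mean_above = sum(x * p for x, p in zip(values, probs) if x >= zeta) / p_above
    if not mean_above > zeta:
        raise ValueError("need E[X | X >= zeta] > zeta")
    lam = (zeta - mean_below) / (mean_above - mean_below)
    new_probs = [
        p * ((1.0 - lam) / p_below if x < zeta else lam / p_above)
        for x, p in zip(values, probs)
    ]
    return lam, new_probs


atoms = [-3.0, -1.0, 0.5, 2.0, 6.0]
weights = [0.1, 0.2, 0.3, 0.25, 0.15]
lam, new_weights = tilted_law(atoms, weights, 1.0)
new_mean = sum(x * q for x, q in zip(atoms, new_weights))
```

All new weights are strictly positive whenever the old ones are, which reflects the equivalence of the two laws, and the ratio of new to old weights is constant on each side of $\zeta$.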

Note that

is a probability measure which is equivalent to

(

X ∈ ·

) =

(

)

and has mean

. Thus, the density

/ dP

(

X ∈ ·

) is a function that satisﬁes

(4.19)

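Moreover, such a density is explicitly given by

$$f_\zeta(x) = \frac{1-\lambda(\zeta)}{P(X<\zeta)}\mathbf 1_{(-\infty,\zeta)}(x) + \frac{\lambda(\zeta)}{P(X\ge\zeta)}\mathbf 1_{[\zeta,\infty)}(x),\quad x\in\mathbb R,$$

and we thus see that the map $(x,\zeta)\mapsto f_\zeta(x)$ is $\mathcal B(\mathbb R^2)$-measurable. By letting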

α(t,x) = f

−(Y

)/F

(R)

(x)

for

|x| > a

and

(

t,x

) = 1 for

|x| ≤ a

, we obtain a predictable function

, which is

strictly positive and satisﬁes

(2.5)

. Thus, it suﬃces to argue that an

with these

properties deﬁnes an ELMM for (

)

t∈[0,T ]

through

((

α −

∗

(

µ −ν

))

. First observe

that (α −1) ∗(µ −ν) is well-deﬁned, since

|α −1|∗ν

[−a,a]

|α(s,x) −1|F(dx) ds ≤ 2F

(R)T .

The ﬁrst property in

(2.5)

and the fact that

(

t,x

) = 1 when

|x| ≤ a

imply that (

α −

∗

= 0 for t ∈ [0,T ]. Consequently,

E ((α −1) ∗(µ −ν))

= E ((α −1) ∗µ)

= e

(α−1)∗µ

+(logα−(α−1))∗µ

n=1

α(T


where (

)

n≥1

is deﬁned as in

(4.15)

for the compound Poisson process

[−a,a]

(

)

∗

t ∈

], and

[−a,a]

(

)

∗µ

t ∈

], is the underlying Poisson process that

counts the number of jumps. In particular, for any given n ≥ 1, we have

E[E ((α −1) ∗µ)

| F

n−1

] = E ((α −1) ∗µ)

n−1

E[E[α(T

) | F

−

] | F

n−1

]

= E ((α −1) ∗µ)

n−1

almost surely by the inclusion F

n−1

⊆ F

−

, if we can show that

E[α(T

) | F

−

] = 1.

(Here we recall that

≡

0.) However, this follows from the observations that

α(T

) = 1 almost surely on the set {T

= T } (since Z

= 0) and

E[α(T

) | F

−

] = F

(R)

−1

[−a,a]

α(T

,x)F(dx) = 1

almost surely on

< T }

. The latter observation is implied by

(2.5)

and

(4.18)

, since

(

ω, x

)

7→ α

(

ω, T

(

)

) is

−

⊗B

(

)-measurable. Consequently, (

((

α −

∗µ

)

n≥0

is a positive

-martingale with respect to the ﬁltration (

)

n≥0

and its mean is

constantly equal to one, so we may deﬁne a probability measure

/ dP

((

α −

∗µ

)

for each

n ≥

1. By

(3.2)

it follows that the compensator

under

is [

(

t,x

)

{t≤T

}

{t>T

}

]

(

)

. From this we get that the counting

process 1

[−a,a]

(x) ∗µ

, t ∈ [0,T ], is compensated by

[−a,a]

(x)[α(s,x)1

{s≤T

}

+ 1

{s>T

}

]F(dx) ds = F

(R)t, t ∈ [0,T ],

under

using

(2.5)

. This shows that jumps continue to arrive according to a Poisson

process with intensity

(

) (see, e.g., [16, Theorem 4.5 (Ch. II)]), which in turn

implies that

E[E ((α −1) ∗µ)

<T }

] = Q

< T ) = P(T

< T ) → 0, n → ∞.

As a consequence,

1 = lim

n→∞

E[E ((α −1) ∗µ)

<T }

] + lim

n→∞

E[E ((α −1) ∗µ)

=T }

]

= lim

n→∞

E[E ((α −1) ∗µ)

=T }

]

= E[E ((α −1) ∗µ)

This shows that

deﬁned by

((

α −

∗µ

)

is a probability measure on

(

Ω,F

). To show that (

)

t∈[0,T ]

is a local martingale under

we just observe that the

compensator of x1

[−a,a]

(x) ∗µ

, t ∈ [0,T ], is given by

[−a,a]

xα(s,x)F(dx) ds = −

+ b

) ds, t ∈ [0,T ],

according to (2.5). Thus, (2.7) holds and the proof is complete by Remark 2.2. 

Finally, we use Theorem 2.1 to prove Theorem 1.2.


Proof of Theorem 1.2.

Without loss of generality we assume that

[

] = 0. Suppose

that (

)

t∈[0,T ]

admits an ELMM. Then, it is a semimartingale and the assumptions

imposed imply by [4, Theorem 4.1 and Corollary 4.8] that

is absolutely continuous

with a density ϕ

satisfying (1.3). Moreover, we ﬁnd for any t ∈ [0,T ] that

−X

ϕ(t −s) dL

−∞

[ϕ(t −s) −ϕ(−s)] dL

= ϕ(0)L

t−s

−s

(u) du dL

−∞

t−s

−s

(u) du dL

= ϕ(0)L

(u −s) dL

du +

−∞

(u −s) dL

= ϕ(0)L

du (4.21)

where

−∞

(

u −s

)

. (Here we have applied a stochastic Fubini result, which

may be found in [1, Theorem 3.1]. Moreover, we have extended the functions

and

from

by setting

(

) =

(

) = 0 for

t <

0.) Note that, according to [9] and [16,

Theorem 2.28 (Ch. I)], we may choose (

)

t∈[0,T ]

predictable. From this representation

we find that $\varphi(0)\ne 0$, since otherwise an ELMM for $(X_t)_{t\in[0,T]}$ would imply $\varphi\equiv 0$.

Conversely, if

has

(0)

0, is absolutely continuous, and the density

meets

(1.3)

, we get from [4, Theorem 4.1 and Corollary 4.8] that (

)

t∈[0,T ]

is a semimartin-

gale that takes the form

(4.21)

. Since the support of the Lévy measure of (

(0)

)

t∈[0,T ]

is unbounded on both (

−∞,

0] and [0

,∞

), hypothesis (h2) of Theorem 2.1 is satisﬁed

and we deduce the existence of an ELMM for (

)

t∈[0,T ]

. Suppose now instead that

the density

is bounded, the support of

(the Lévy measure of

) is bounded, and

((

−∞,

0))

((0

,∞

))

0. Then we observe initially that, according to [25], (

)

t∈[0,T ]

is a stationary process, in particular tight, under

and the law of

is inﬁnitely

divisible with a Lévy measure given by

(B) = (F ×Leb)



{(x, s) ∈ R ×(0,∞) : xϕ

(s) ∈B \{0}}



, B ∈ B (R).

(Here Leb denotes the Lebesgue measure on (0

,∞

).) In particular, if

C >

0 is a constant

that bounds ϕ

we get the inequality

([−M,M]

) ≤ (F ×Leb)



[−

]

×(0,∞)



for any

M >

0, and this shows that the Lévy measure of

is compactly supported

since the same holds for

. In this case, (h1) of Theorem 2.1 holds and we can conclude

that an ELMM for (X

)

t∈[0,T ]

exists. 

Remark 4.6.

A natural comment is on the existence of

a,b >

0 with the property

(2.1)

. In light of the structure of the ELMM presented in Theorem 2.1, discussed in

Remark 2.2, this assumption seems very natural. Indeed, assume that the triplet of

is given relative to the truncation function

(

) =

[−a,a]

(

) and set

. Then,

according to (2.7), we try to ﬁnd Q under which

{|x|>a}

∗µ

{x>a}

∗µ

−

|x|1

{x<−a}

∗µ

−

, t ∈ [0,T ], (4.22)


is a local martingale. Intuitively,

should ensure that positive jumps are compensated

by the negative drift part and vice versa. Clearly, this construction is not possible if

(2.1)

does not hold for any $a,b>0$ and all jumps are of the same sign. In case all jumps are of the same sign, it may sometimes be possible to construct $Q$, although the recipe

becomes rather case speciﬁc. For instance, if all jumps of

are positive, one may still

make the desired change of measure under (h1) or under the hypothesis that

has

unbounded support on (0

,∞

), provided that the second term in

(4.22)

is not present.

Even in the case where the term

(

)

t ∈

], is non-zero, it might possibly

be absorbed by a change of drift of the Gaussian component in L if such exists.

Acknowledgments

We thank the referee for a clear and constructive report. This work was supported by

the Danish Council for Independent Research (grant DFF–4002–00003).

References

[1]

Barndorﬀ-Nielsen, O.E. and A. Basse-O’Connor (2011). Quasi Ornstein–Uhlen-

beck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[2]

Barndorﬀ-Nielsen, O.E. and A. Shiryaev (2015). Change of Time and Change of

Measure. Second. Advanced Series on Statistical Science & Applied Probability,

21. World Scientiﬁc Publishing Co. Pte. Ltd., Hackensack, NJ, xviii+326. doi:

10.1142/9609.

[3]

Basse-O’Connor, A. and J. Pedersen (2009). Lévy driven moving averages and

semimartingales. Stochastic Process. Appl. 119(9), 2970–2991. doi:

10.1016/j

.spa.2009.03.007.

[4]

Basse-O’Connor, A. and J. Rosiński (2016). On inﬁnitely divisible semimartin-

gales. Probab. Theory Related Fields 164(1–2), 133–163. doi:

10.1007/s00440-0

14-0609-1.

[5]

Bichteler, K. (2002). Stochastic integration with jumps. Vol. 89. Encyclopedia of

Mathematics and its Applications. Cambridge University Press, xiv+501. doi:

10.1017/CBO9780511549878.

[6]

Brémaud, P. (1981). Point Processes and Queues. Martingale dynamics, Springer

Series in Statistics. Springer-Verlag, New York-Berlin, xviii+354.

[7]

Cheridito, P. (2004). Gaussian moving averages, semimartingales and option

pricing. Stochastic Process. Appl. 109(1), 47–68.

[8]

Cheridito, P., D. Filipović and M. Yor (2005). Equivalent and absolutely contin-

uous measure changes for jump-diﬀusion processes. Ann. Appl. Probab. 15(3),

1713–1732. doi: 10.1214/105051605000000197.

[9]

Cohn, D.L. (1972). Measurable choice of limit points and the existence of

separable and measurable processes. Z. Wahrsch. Verw. Gebiete 22, 161–165.

Paper A · Equivalent martingale measures for Lévy-driven moving averages and related

processes

[10]

Criens, D. (2016). Structure Preserving Equivalent Martingale Measures for

SCII Models. arXiv: 1606.02593.

[11]

Dawson, D.A. (1968). Equivalence of Markov processes. Trans. Amer. Math. Soc.

131, 1–31.

[12]

Delbaen, F. and W. Schachermayer (1994). A general version of the fundamental

theorem of asset pricing. Math. Ann. 300(3), 463–520.

[13]

Eberlein, E. and J. Jacod (1997). On the range of options prices. English. Finance

Stoch. 1(2), 131–140.

[14]

Gjessing, H.K., K. Røysland, E.A. Pena and O.O. Aalen (2010). Recurrent

events and the exploding Cox model. Lifetime Data Anal. 16(4), 525–546. doi:

10.1007/s10985-010-9180-y.

[15]

Hida, T. and M. Hitsuda (1993). Gaussian Processes. Vol. 120. Translations of

Mathematical Monographs. Translated from the 1976 Japanese original by the

authors. Providence, RI: American Mathematical Society, xvi+183.

[16]

Jacod, J. and A.N. Shiryaev (2003). Limit Theorems for Stochastic Processes. Sec-

ond. Vol. 288. Grundlehren der Mathematischen Wissenschaften [Fundamental

Principles of Mathematical Sciences]. Springer-Verlag, Berlin. doi:

10.1007/97

8-3-662-05265-5.

[17]

Kabanov, Y.M, R.S. Liptser and A.N. Shiryayev (1980). “On absolute continuity

of probability measures for Markov-Itô processes”. Stochastic diﬀerential systems

(Proc. IFIP-WG 7/1 Working Conf., Vilnius, 1978). Vol. 25. Lecture Notes in

Control and Information Sci. Springer, Berlin-New York, 114–128.

[18]

Kallsen, J. (2006). “A didactic note on aﬃne stochastic volatility models”.

From stochastic calculus to mathematical ﬁnance. Springer, Berlin, 343–368. doi:

10.1007/978-3-540-30788-4_18.

[19]

Kallsen, J. and A.N. Shiryaev (2002). The cumulant process and Esscher’s

change of measure. Finance Stoch. 6(4), 397–428. doi:

10.1007/s00780020006

[20]

Knight, F.B. (1992). Foundations of the Prediction Process. Vol. 1. Oxford Studies

in Probability. Oxford Science Publications. New York: The Clarendon Press

Oxford University Press, xii+248.

[21]

Lépingle, D. and J. Mémin (1978). Sur l’intégrabilité uniforme des martingales

exponentielles. Z. Wahrsch. Verw. Gebiete 42(3), 175–203. doi:

10.1007/BF0064

1409.

[22]

Mijatović, A. and M. Urusov (2012). On the martingale property of certain local martingales. Probab. Theory Related Fields 152(1-2), 1–30. doi: 10.1007/s00440-010-0314-7.

[23]

Podolskij, M. (2015). “Ambit ﬁelds: survey and new challenges”. XI Symposium

on Probability and Stochastic Processes. Springer, 241–279.


[24]

Protter, P. and K. Shimbo (2008). “No arbitrage and general semimartingales”.

Markov processes and related topics: a Festschrift for Thomas G. Kurtz. Vol. 4.

Inst. Math. Stat. Collect. Inst. Math. Statist., Beachwood, OH, 267–283. doi:

10.1214/074921708000000426.

[25]

Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[26]

Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Vol. 68. Cambridge Studies in Advanced Mathematics. Translated from the 1990 Japanese original, Revised by the author. Cambridge University Press.

[27]

Sokol, A. and N.R. Hansen (2015). Exponential martingales and changes of measure for counting processes. Stoch. Anal. Appl. 33(5), 823–843. doi: 10.1080/07362994.2015.1040890.

Paper B

Stochastic Delay Diﬀerential Equations and

Related Autoregressive Models

Andreas Basse-O’Connor, Mikkel Slot Nielsen, Jan Pedersen and Victor Rohde

Abstract

In this paper we suggest two continuous-time models which exhibit an autore-

gressive structure. We obtain existence and uniqueness results and study the

structure of the solution processes. One of the models, which corresponds to

general stochastic delay diﬀerential equations, will be given particular attention.

We use the obtained results to link the introduced processes to both discrete-time

and continuous-time ARMA processes.

MSC: 60G10; 60G22; 60H10; 60H20

Keywords: Autoregressive structures; Stochastic delay diﬀerential equations; Processes of

Ornstein–Uhlenbeck type; Long-range dependence; CARMA processes; Moving averages

1 Introduction

Let $(L_t)_{t\in\mathbb{R}}$ be a two-sided Lévy process and $\psi\colon\mathbb{R}\to\mathbb{R}$ some measurable function which is integrable with respect to $(L_t)_{t\in\mathbb{R}}$ (in the sense of [23]). Processes of the form

$$X_t = \int_{\mathbb{R}} \psi(t-u)\,dL_u, \quad t\in\mathbb{R}, \tag{1.1}$$

are known as (stationary) continuous-time moving averages and have been studied extensively. Their popularity may be explained by the Wold–Karhunen decomposition: up to a drift term, essentially any stationary and square integrable process admits a representation of the form (1.1) with $(L_t)_{t\in\mathbb{R}}$ replaced by a process with second order stationary and orthogonal increments. For details on this type of representations, see [28, Section 26.2] and [2, Theorem 4.1]. Note that the model (1.1) nests the discrete-time moving average with filter $(\psi_j)_{j\in\mathbb{Z}}$ (at least when it is driven by an infinitely

Paper B · Stochastic delay diﬀerential equations and related autoregressive models

divisible noise), since one can choose $\psi(t) = \sum_{j\in\mathbb{Z}} \psi_j \mathbf{1}_{(j-1,j]}(t)$. Another example of (1.1) is the Ornstein–Uhlenbeck process corresponding to $\psi(t) = e^{-\lambda t}\mathbf{1}_{[0,\infty)}(t)$ for $\lambda > 0$. Ornstein–Uhlenbeck processes often serve as building blocks in stochastic modeling, e.g. in stochastic volatility models for option pricing as illustrated in [4] or in models for (log) spot price of many different commodities, e.g., as in [26]. A generalization of the Ornstein–Uhlenbeck process, which is also of the form (1.1), is the CARMA process. To be concrete, for two real polynomials $P$ and $Q$, of degree $p$ and $q$ ($p > q$) respectively, with no zeroes on $\{z\in\mathbb{C} : \operatorname{Re}(z) = 0\}$, choosing $\psi\colon\mathbb{R}\to\mathbb{R}$ to be the function characterized by

$$\int_{\mathbb{R}} e^{-ity}\psi(t)\,dt = \frac{Q(iy)}{P(iy)}, \quad y\in\mathbb{R},$$

results in a CARMA process. CARMA processes have found many applications, and extensions to account for long memory and to a multivariate setting have been made. For more on CARMA processes and their extensions, see [9, 10, 14, 19, 27]. Many general properties of continuous-time moving averages are well understood. This includes when they have long memory and have sample paths of finite variation (or, more generally, are semimartingales). For an extensive treatment of these processes and further examples we refer to [5, 6] and [3], respectively.

Instead of specifying the kernel $\psi$ in (1.1) directly it is often preferred to view $(X_t)_{t\in\mathbb{R}}$ as a solution to a certain equation. For instance, as an alternative to (1.1) the Ornstein–Uhlenbeck process with parameter $\lambda > 0$, respectively the discrete-time moving average with filter $\psi_j = \alpha^{j-1}\mathbf{1}_{j\geq 1}$ for some $\alpha\in\mathbb{R}$ with $|\alpha| < 1$, may be characterized as the unique stationary process that satisfies

$$dX_t = -\lambda X_t\,dt + dL_t, \quad t\in\mathbb{R}, \tag{1.2}$$

respectively

$$X_t = \alpha X_{t-1} + L_t - L_{t-1}, \quad t\in\mathbb{R}. \tag{1.3}$$

The representations (1.2)–(1.3) are useful in many respects, e.g., for understanding the evolution of the process over time, for studying properties of $(L_t)_{t\in\mathbb{R}}$ through observations of $(X_t)_{t\in\mathbb{R}}$, or for computing prediction formulas (which, eventually, may be used to estimate the models). Therefore, we aim at generalizing equations (1.2)–(1.3) in a suitable way and studying the corresponding solutions. Through this study we will argue that these generalizations lead to a wide class of stationary processes, which enjoy many of the same properties as the solutions to (1.2)–(1.3).
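As a small numerical illustration of how the recursion (1.3) and the moving average with geometric filter $\alpha^{j-1}$ describe the same object, the following sketch builds the moving average from simulated unit-time increments and checks that it satisfies (1.3). The Gaussian increments, the value of `alpha`, the helper name `ma_from_filter` and the truncation at time zero (which means the simulated path is not started in its stationary law) are assumptions made purely for illustration.

```python
import numpy as np

def ma_from_filter(increments, alpha, n_terms):
    """Truncated moving average X_k = sum_{j>=0} alpha^j * increments[k-j]."""
    weights = alpha ** np.arange(n_terms)      # alpha^0, alpha^1, ...
    # Full discrete convolution, trimmed to the length of the input.
    return np.convolve(increments, weights)[: len(increments)]

rng = np.random.default_rng(0)
alpha = 0.6                                    # hypothetical value with |alpha| < 1
dL = rng.standard_normal(500)                  # stand-in for L_k - L_{k-1}
X = ma_from_filter(dL, alpha, n_terms=len(dL))

# Residual of (1.3): X_t - alpha * X_{t-1} - (L_t - L_{t-1})
residual = X[1:] - alpha * X[:-1] - dL[1:]
max_residual = float(np.max(np.abs(residual)))
```

The residual vanishes up to floating-point error because the geometric weights telescope under the recursion, which is the discrete analogue of moving between the kernel form (1.1) and the equation form.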

The two models of interest: Let $\eta$ and $\phi$ be finite signed measures concentrated on $[0,\infty)$ and $(0,\infty)$, respectively, and let $\theta\colon\mathbb{R}\to\mathbb{R}$ be some measurable function (typically chosen to have a particularly simple structure) which is integrable with respect to $(L_t)_{t\in\mathbb{R}}$. Moreover, suppose that $(Z_t)_{t\in\mathbb{R}}$ is a measurable and integrable process with stationary increments. The equations of interest are

$$dX_t = \int_{[0,\infty)} X_{t-u}\,\eta(du)\,dt + dZ_t, \quad t\in\mathbb{R}, \tag{1.4}$$

and

$$X_t = \int_0^\infty X_{t-u}\,\phi(du) + \int_{-\infty}^t \theta(t-u)\,dL_u, \quad t\in\mathbb{R}. \tag{1.5}$$

We see that (1.2) is a special case of (1.4) with $\eta = -\lambda\delta_0$ and $Z_t = L_t$, and (1.3) is a special case of (1.5) with $\phi = \alpha\delta_1$ and $\theta = \mathbf{1}_{(0,1]}$. Here $\delta_c$ refers to the Dirac measure at $c\in\mathbb{R}$. Equation (1.4) is known in the literature as a stochastic delay differential equation (SDDE), and existence and (distributional) uniqueness results have been obtained when $\eta$ is compactly supported and $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process (see [13, 16]).

As indicated above, models of the type (1.4) are useful for recovering the increments of $(Z_t)_{t\in\mathbb{R}}$ as well as prediction and estimation. We refer to [7, 17, 21] for details. Another generalization of the noise term is given in [24]. Other parametrizations of $\phi$ in (1.5) that we will study in Examples 3.4 and 3.6 are $\phi(dt) = \alpha e^{-\beta t}\mathbf{1}_{[0,\infty)}(t)\,dt$ and $\phi = \sum_{j=1}^p \phi_j\delta_j$ for $\alpha,\phi_j\in\mathbb{R}$ and $\beta > 0$. As far as we know, equations of the type (1.5) have not been studied before. We will refer to (1.5) as a level model, since it specifies $X_t$ directly (rather than its increments, $X_t - X_s$). Although the level model may seem odd at first glance as the noise term is forced to be stationary, one of its strengths is that it can be used as a model for the increments of a stationary increment process. We present this idea in Example 3.5 where a stationary increment solution to (1.4) is found when no stationary solution exists.

Our main results: In Section 2 we prove existence and uniqueness in the model (1.4) under the assumptions that

$$\int_{[0,\infty)} u^2\,|\eta|(du) < \infty \quad\text{and}\quad iy - \int_{[0,\infty)} e^{-iuy}\,\eta(du) \neq 0$$

for all $y\in\mathbb{R}$ ($|\eta|$ being the variation of $\eta$). In relation to this result we provide several examples of choices of $\eta$ and $(Z_t)_{t\in\mathbb{R}}$. Among other things, we show that long memory in the sense of a hyperbolically decaying autocovariance function can be incorporated through the noise process $(Z_t)_{t\in\mathbb{R}}$, and we indicate how invertible CARMA processes can be viewed as solutions to SDDEs. Moreover, in Corollary 2.6 it is observed that as long as $(Z_t)_{t\in\mathbb{R}}$ is of the form

$$Z_t = \int_{\mathbb{R}} [\theta(t-u) - \theta_0(-u)]\,dL_u, \quad t\in\mathbb{R},$$

for suitable kernels $\theta,\theta_0\colon\mathbb{R}\to\mathbb{R}$, the solution to (1.4) is a moving average of the type (1.1). On the other hand, Example 2.14 provides an example of $(Z_t)_{t\in\mathbb{R}}$ where the solution is not of the form (1.1). Next, in Section 3, we briefly discuss existence and uniqueness of solutions to (1.5) and provide a few examples. Section 4 contains some technical results together with proofs of all the presented results.

Our proofs rely heavily on the theory of Fourier (and, more generally, bilateral Laplace) transforms, in particular as it concerns functions belonging to certain Hardy spaces (or to slight modifications of such). Specific types of Musielak–Orlicz spaces will also play an important role in establishing our results.

Definitions and conventions: For $p\in(0,\infty]$ and a (non-negative) measure $\mu$ on the Borel $\sigma$-field $\mathcal{B}(\mathbb{R})$ on $\mathbb{R}$ we denote by $L^p(\mu)$ the usual $L^p$ space relative to $\mu$. If $\mu$ is the Lebesgue measure, we will suppress the dependence on the measure and write $f\in L^p$. By a finite signed measure we refer to a set function $\mu\colon\mathcal{B}(\mathbb{R})\to\mathbb{R}$ of the form $\mu = \mu^+ - \mu^-$, where $\mu^+$ and $\mu^-$ are two finite measures which are mutually singular. Integration of a function $f\colon\mathbb{R}\to\mathbb{R}$ is defined in an obvious way whenever $f\in L^1(|\mu|)$, where $|\mu| := \mu^+ + \mu^-$. For any given finite signed measure $\mu$ and $z\in\mathbb{C}$ such that

$$\int_{\mathbb{R}} e^{-\operatorname{Re}(z)t}\,|\mu|(dt) < \infty,$$

we define the bilateral Laplace transform $\mathcal{L}[\mu](z)$ of $\mu$ at $z$ by

$$\mathcal{L}[\mu](z) = \int_{\mathbb{R}} e^{-zt}\,\mu(dt).$$

In particular, the Fourier transform $\mathcal{F}[\mu](y) := \mathcal{L}[\mu](iy)$ is well-defined for all $y\in\mathbb{R}$. (Note that the Laplace and Fourier transforms are often defined with a minus in the exponent; we have chosen this alternative definition so that $\mathcal{F}[\mu]$ coincides with the traditional definition of the characteristic function.) If $f\in L^1$ we define $\mathcal{L}[f] := \mathcal{L}[f(u)\,du]$. We note that $\mathcal{F}[f]\in L^2$ when $f\in L^1\cap L^2$ and that $\mathcal{F}$ can be extended to an isometric isomorphism from $L^2$ onto $L^2$ by Plancherel’s theorem.

For two finite signed measures $\mu$ and $\nu$ we define the convolution $\mu*\nu$ as

$$\mu*\nu(B) = \int_{\mathbb{R}}\int_{\mathbb{R}} \mathbf{1}_B(t+u)\,\mu(dt)\,\nu(du)$$

for any Borel set $B$. Moreover, if $f\colon\mathbb{R}\to\mathbb{R}$ is a measurable function such that $f(t-\,\cdot\,)\in L^1(|\mu|)$ we define the convolution $f*\mu(t)$ at $t\in\mathbb{R}$ by

$$f*\mu(t) = \int_{\mathbb{R}} f(t-u)\,\mu(du).$$

Recall also that a process $(L_t)_{t\in\mathbb{R}}$, $L_0 = 0$, is called a (two-sided) Lévy process if it has stationary and independent increments and càdlàg sample paths (for details, see [25]). Let $(L_t)_{t\in\mathbb{R}}$ be a centered Lévy process with Gaussian component $\sigma^2$ and Lévy measure $\nu$. Then, for any measurable function $f\colon\mathbb{R}\to\mathbb{R}$ satisfying

$$\int_{\mathbb{R}}\Big( f(u)^2\sigma^2 + \int_{\mathbb{R}} \big(|xf(u)|^2\wedge|xf(u)|\big)\,\nu(dx)\Big)\,du < \infty, \tag{1.6}$$

the integral of $f$ with respect to $(L_t)_{t\in\mathbb{R}}$ is well-defined and belongs to $L^1(\mathbb{P})$ (see [23, Theorem 3.3]).

2 The SDDE setup

Recall that, for a given finite signed measure $\eta$ on $[0,\infty)$ and a measurable process $(Z_t)_{t\in\mathbb{R}}$ with stationary increments and $\mathbb{E}[|Z_t|] < \infty$ for all $t$, we are interested in the existence and uniqueness of a measurable and stationary process $(X_t)_{t\in\mathbb{R}}$ with $\mathbb{E}[|X_0|] < \infty$ which satisfies

$$X_t - X_s = \int_s^t\int_{[0,\infty)} X_{u-v}\,\eta(dv)\,du + Z_t - Z_s \tag{2.1}$$

almost surely for each $s < t$.

Remark 2.1. In the literature, (2.1) is often solved on $[0,\infty)$ given an initial condition $(X_u)_{u\leq 0}$. However, since we will be interested in (possibly) non-causal solutions, it turns out to be convenient to solve (2.1) with no initial condition (see [12, p. 46 and Section 3.2] for details).

In line with [13], we will construct a solution as a convolution of $(Z_t)_{t\in\mathbb{R}}$ and a deterministic kernel $x_0\colon\mathbb{R}\to\mathbb{R}$ characterized through $\eta$. This kernel is known as the differential resolvent (of $\eta$) in the literature. Although many (if not all) of the statements of Lemma 2.2 concerning $x_0$ should be well-known, we have not been able to find a precise reference, and hence we have chosen to include a proof here. The core of Lemma 2.2 as well as further properties of differential resolvents can be found in [12, Section 3.3].

In the formulation we will say that $\eta$ has $n$th moment, $n\in\mathbb{N}$, if $v\mapsto v^n\in L^1(|\eta|)$ and that $\eta$ has an exponential moment of order $\delta\geq 0$ if $v\mapsto e^{\delta v}\in L^1(|\eta|)$. Finally, we will make use of the function

$$h(z) := z - \mathcal{L}[\eta](z), \tag{2.2}$$

which is always well-defined for $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq -\delta$ if $\eta$ admits an exponential moment of order $\delta\geq 0$.

Lemma 2.2. Suppose that $h(iy)\neq 0$ for all $y\in\mathbb{R}$. Then there exists a unique function $x_0\colon\mathbb{R}\to\mathbb{R}$, which meets $u\mapsto x_0(u)e^{cu}\in L^2$ for all $c\in[-a,0]$ and a suitably chosen $a > 0$, and satisfies

$$x_0(t) = \mathbf{1}_{[0,\infty)}(t) + \int_{-\infty}^t\int_{[0,\infty)} x_0(u-v)\,\eta(dv)\,du, \quad t\in\mathbb{R}. \tag{2.3}$$

Furthermore, $x_0$ is characterized by $\mathcal{L}[x_0](z) = 1/h(z)$ for $z\in\mathbb{C}$ with $\operatorname{Re}(z)\in(0,a)$, and the following statements hold:

(i) If $\eta$ has $n$th moment for some $n\in\mathbb{N}$, then $(u\mapsto x_0(u)u^n)\in L^2$. In particular, $x_0\in L^q$ for all $q\in[1/n,\infty]$.

(ii) If $\eta$ has an exponential moment of order $\delta > 0$, then there exists $\varepsilon\in(0,\delta]$ such that $u\mapsto x_0(u)e^{cu}\in L^2$ for all $c\in[-a,\varepsilon]$ and, in particular, $x_0\in L^q$ for all $q\in(0,\infty]$.

(iii) If $h(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$, then $x_0(t) = 0$ for all $t < 0$.

From (2.3) it follows that $x_0$ induces a Lebesgue–Stieltjes measure $\mu_{x_0}$. From Lemma 2.2 we deduce immediately the following properties of $\mu_{x_0}$:

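Corollary 2.3. Suppose that $h(iy)\neq 0$ for all $y\in\mathbb{R}$. Then $x_0$ defines a Lebesgue–Stieltjes measure, and it is given by

$$\mu_{x_0}(du) = \delta_0(du) + \Big(\int_{[0,\infty)} x_0(u-v)\,\eta(dv)\Big)\,du.$$

A function $\theta\colon\mathbb{R}\to\mathbb{R}$ is integrable with respect to $\mu_{x_0}$ if and only if

$$\int_{[0,\infty)}\int_{\mathbb{R}} |\theta(u+v)x_0(u)|\,du\,|\eta|(dv) < \infty. \tag{2.4}$$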

Example 2.4. Let the setup be as in Corollary 2.3. We will here discuss a few implications of this result.

(i) Suppose that $\eta$ has $n$th moment for some $n\in\mathbb{N}$. By using the inequality $|u+v|^{n-1}\leq 2^{n-1}(|u|^{n-1}+|v|^{n-1})$ we establish that

$$\int_{[0,\infty)}\int_{\mathbb{R}} |(u+v)^{n-1}x_0(u)|\,du\,|\eta|(dv) \leq 2^{n-1}\Big( |\eta|([0,\infty))\int_{\mathbb{R}} |x_0(u)u^{n-1}|\,du + \int_{[0,\infty)} |v|^{n-1}\,|\eta|(dv)\int_{\mathbb{R}} |x_0(u)|\,du\Big). \tag{2.5}$$

The last term on the right-hand side of (2.5) is finite, since $x_0\in L^1$ according to Lemma 2.2(i). The Cauchy–Schwarz inequality and the same lemma once again imply

$$\Big(\int_{|u|>1} |x_0(u)u^{n-1}|\,du\Big)^2 \leq \int_{|u|>1} \big(x_0(u)u^n\big)^2\,du \int_{|u|>1} u^{-2}\,du < \infty.$$

Consequently, since $u\mapsto x_0(u)u^{n-1}$ is locally bounded, we deduce that $(u\mapsto x_0(u)u^{n-1})\in L^1$ and that the first term on the right-hand side of (2.5) is also finite. It follows that (2.4) is satisfied for $\theta(u) = |u|^{n-1}$, so $\mu_{x_0}$ has moments up to order $n-1$.

(ii) Suppose that $\eta$ has an exponential moment of order $\delta > 0$. Let $\gamma$ be any number in $(0,\varepsilon)$, where $\varepsilon\in(0,\delta)$ is chosen as in Lemma 2.2(ii). With this choice it is straightforward to check that $(u\mapsto x_0(u)e^{\gamma u})\in L^1$, and hence

$$\int_{[0,\infty)}\int_{\mathbb{R}} e^{\gamma(u+v)}|x_0(u)|\,du\,|\eta|(dv) = \int_{[0,\infty)} e^{\gamma v}\,|\eta|(dv)\int_{\mathbb{R}} |x_0(u)|e^{\gamma u}\,du < \infty.$$

This shows that (2.4) holds with $\theta(u) = e^{\gamma u}$, so $\mu_{x_0}$ has an exponential moment of order $\gamma > 0$.

(iii) Whenever $\eta$ has first moment, $x_0$ is bounded (cf. Lemma 2.2(i)). Thus, under this assumption, a sufficient condition for (2.4) to hold is that $\theta\in L^1$.

With the diﬀerential resolvent in hand we present our main result of this section:

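Theorem 2.5. Let $(Z_t)_{t\in\mathbb{R}}$ be a measurable process which has stationary increments and satisfies $\mathbb{E}[|Z_t|] < \infty$ for all $t$. Suppose that $\eta$ is a finite signed measure with second moment and $h(iy)\neq 0$ for all $y\in\mathbb{R}$. Then the process

$$X_t = Z_t + \int_{\mathbb{R}} Z_{t-u}\int_{[0,\infty)} x_0(u-v)\,\eta(dv)\,du, \quad t\in\mathbb{R}, \tag{2.6}$$

is well-defined and the unique integrable stationary solution (up to modification) of equation (2.1). If $h(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$, $(X_t)_{t\in\mathbb{R}}$ admits the following causal representation:

$$X_t = \int_0^\infty [Z_{t-u} - Z_t]\int_{[0,\infty)} x_0(u-v)\,\eta(dv)\,du, \quad t\in\mathbb{R}. \tag{2.7}$$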

Often, $(Z_t)_{t\in\mathbb{R}}$ is given by

$$Z_t = \int_{\mathbb{R}} [\theta(t-u) - \theta(-u)]\,dL_u, \quad t\in\mathbb{R}, \tag{2.8}$$

for some integrable Lévy process $(L_t)_{t\in\mathbb{R}}$ with $\mathbb{E}[L_1] = 0$ and measurable function $\theta\colon\mathbb{R}\to\mathbb{R}$ such that $u\mapsto\theta(t+u)-\theta(u)$ satisfies (1.6) for $t > 0$. The next result shows that the (unique) solution to (2.1) is a Lévy-driven moving average in this particular setup.

Corollary 2.6. Let the setup be as in Theorem 2.5 and suppose that $(Z_t)_{t\in\mathbb{R}}$ is of the form (2.8). Then the unique integrable and stationary solution to (2.1) is given by

$$X_t = \int_{\mathbb{R}} \theta*\mu_{x_0}(t-u)\,dL_u, \quad t\in\mathbb{R}. \tag{2.9}$$

In particular, if $Z_t = L_t$ for $t\in\mathbb{R}$, we have that

$$X_t = \int_{\mathbb{R}} x_0(t-u)\,dL_u, \quad t\in\mathbb{R}.$$

Remark 2.7. Let the situation be as in Corollary 2.6 with $h(z)\neq 0$ whenever $\operatorname{Re}(z)\geq 0$. In this case we know from Theorem 2.5 that $(X_t)_{t\in\mathbb{R}}$ has the causal representation (2.7) with respect to $(Z_t)_{t\in\mathbb{R}}$. Now, if $(Z_t)_{t\in\mathbb{R}}$ is causal with respect to $(L_t)_{t\in\mathbb{R}}$ in the sense that $\theta(t) = 0$ for $t < 0$, $(X_t)_{t\in\mathbb{R}}$ admits the following causal representation with respect to $(L_t)_{t\in\mathbb{R}}$:

$$X_t = \int_{-\infty}^t \theta*\mu_{x_0}(t-u)\,dL_u, \quad t\in\mathbb{R}.$$

This follows from (2.9) and the fact that $\theta*\mu_{x_0}(t) = 0$ for $t < 0$ (using Lemma 2.2(iii)).

Remark 2.8. The assumption $h(0) = -\eta([0,\infty))\neq 0$ is rather crucial in order to find stationary solutions. It may be seen as the analogue of assuming that the AR coefficients in a discrete-time ARMA setting do not sum to zero. For instance, the setup where $\eta\equiv 0$ will satisfy $h(iy)\neq 0$ for all $y\in\mathbb{R}\setminus\{0\}$, but if $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process, the SDDE (2.1) cannot have stationary solutions. In Example 3.5, we show how one can find solutions with stationary increments for a reasonably large class of delay measures $\eta$ with $\eta([0,\infty)) = 0$.

Remark 2.9. It should be stressed that for more restrictive choices of $\eta$, and in case $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process, solutions sometimes exist even when $\mathbb{E}[|Z_1|] = \infty$. Indeed, if $\eta$ is compactly supported and $\operatorname{Re}(z)\geq 0$ implies $h(z)\neq 0$, one only needs that $\mathbb{E}[\log^+|Z_1|] < \infty$ to ensure that a stationary solution exists. We refer to [13, 24] for further details.

We now present some concrete examples of SDDEs. The ﬁrst three examples con-

cern the speciﬁcation of the delay measure and the last two concern the speciﬁcation

of the noise.

Example 2.10. Let $\lambda\neq 0$ and consider the equation

$$X_t - X_s = -\lambda\int_s^t X_u\,du + Z_t - Z_s, \quad s < t. \tag{2.10}$$

In the setup of (2.1) this corresponds to $\eta = -\lambda\delta_0$. With $h$ given by (2.2), we have $h(z) = z + \lambda\neq 0$ for every $z\in\mathbb{C}$ with $\operatorname{Re}(z)\neq -\lambda$, and hence Theorem 2.5 implies that there exists a stationary process $(X_t)_{t\in\mathbb{R}}$ with $\mathbb{E}[|X_0|] < \infty$ satisfying (2.10). According to Lemma 2.2 the differential resolvent $x_0$ can be determined through its Laplace transform on $\{z\in\mathbb{C} : 0 < \operatorname{Re}(z) < a\}$ for a suitable $a > 0$ as

$$\mathcal{L}[x_0](z) = \frac{1}{z+\lambda} = \begin{cases} \mathcal{L}[\mathbf{1}_{[0,\infty)}e^{-\lambda\,\cdot}](z) & \text{if } \lambda > 0,\\ \mathcal{L}[-\mathbf{1}_{(-\infty,0)}e^{-\lambda\,\cdot}](z) & \text{if } \lambda < 0. \end{cases}$$

Consequently, by Theorem 2.5,

$$X_t = \begin{cases} Z_t - \lambda e^{-\lambda t}\int_{-\infty}^t e^{\lambda u}Z_u\,du & \text{if } \lambda > 0,\\ Z_t + \lambda e^{-\lambda t}\int_t^\infty e^{\lambda u}Z_u\,du & \text{if } \lambda < 0. \end{cases} \tag{2.11}$$

Ornstein–Uhlenbeck processes satisfying (2.10) have already been studied in the literature, and representations of the stationary solution have been given, see e.g. [2, Theorem 2.1, Proposition 4.2].
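The two candidate resolvents in Example 2.10 can be checked numerically against the defining equation (2.3), which for $\eta = -\lambda\delta_0$ reads $x_0(t) = \mathbf{1}_{[0,\infty)}(t) - \lambda\int_{-\infty}^t x_0(u)\,du$. The sketch below is only an illustration: the grid, the truncation of the lower integration limit at $-T$, the helper names and the test values of $\lambda$ are assumptions, and the integral is approximated by a running trapezoidal rule.

```python
import numpy as np

def resolvent(t, lam):
    """Candidate x_0 for eta = -lam * delta_0: decaying on [0, inf) if lam > 0,
    supported on (-inf, 0) if lam < 0."""
    if lam > 0:
        return np.where(t >= 0, np.exp(-lam * np.clip(t, 0, None)), 0.0)
    return np.where(t < 0, -np.exp(-lam * np.clip(t, None, 0)), 0.0)

def residual_of_2_3(lam, T=40.0, n=400_001):
    """Max absolute gap between x_0 and the right-hand side of (2.3) on [-T, T]."""
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    x = resolvent(t, lam)
    # Running trapezoidal approximation of the integral of x_0 over [-T, t].
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (x[1:] + x[:-1]) * dt)))
    rhs = (t >= 0).astype(float) - lam * integral
    return float(np.max(np.abs(x - rhs)))

gap_positive = residual_of_2_3(1.5)    # hypothetical lambda > 0
gap_negative = residual_of_2_3(-1.5)   # hypothetical lambda < 0
```

Both gaps are of the order of the grid spacing, the dominant error coming from the jump of $x_0$ at zero.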

Example 2.11. Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with $\mathbb{E}[|L_1|] < \infty$. Recall that $(X_t)_{t\in\mathbb{R}}$ is said to be a CARMA(2,1) process if

$$X_t = \int_{-\infty}^t g(t-u)\,dL_u, \quad t\in\mathbb{R},$$

where the kernel $g$ is characterized by

$$\mathcal{F}[g](y) = \frac{iy+b_0}{-y^2+a_1 iy+a_2}, \quad y\in\mathbb{R},$$

for suitable $a_1,a_2,b_0\in\mathbb{R}$, such that $z\mapsto z^2+a_1z+a_2$ has no roots on $\{z\in\mathbb{C} : \operatorname{Re}(z)\geq 0\}$. To relate the CARMA(2,1) process to a solution to an SDDE we will suppose that the invertibility assumption $b_0 > 0$ is satisfied. In particular, $iy+b_0\neq 0$ for all $y\in\mathbb{R}$ and, thus, we may write

$$\mathcal{F}[g](y) = \frac{1}{iy + a_1 - b_0 + \dfrac{a_2 - b_0(a_1-b_0)}{iy+b_0}}, \quad y\in\mathbb{R}.$$

By choosing $\eta(dt) = (b_0-a_1)\delta_0(dt) - (a_2 - b_0(a_1-b_0))e^{-b_0t}\mathbf{1}_{[0,\infty)}(t)\,dt$ (a finite signed measure with exponential moment of any order $\delta < b_0$) it is seen that the function $h$ given in (2.2) satisfies $1/h(iy) = \mathcal{F}[g](y)$ for $y\in\mathbb{R}$. Consequently, we conclude from Theorem 2.5 that the CARMA(2,1) process with parameter vector $(a_1,a_2,b_0)$ is the unique solution to the SDDE (2.1) with delay measure $\eta$. In fact, any CARMA($p,q$) process ($p,q\in\mathbb{N}$ and $p > q$) satisfying a suitable invertibility condition can be represented as the solution to an equation of the SDDE type. See [7, Theorem 4.8] for a precise statement.
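The identity $1/h(iy) = \mathcal{F}[g](y)$ behind Example 2.11 is a polynomial division, and it can be sanity-checked numerically. In the sketch below the parameter values are hypothetical (chosen so that $z^2+a_1z+a_2$ has roots $-1$ and $-2$ and $b_0>0$), and the function names are introduced only for this illustration; the Laplace transform of $\eta$ is entered in closed form using the convention $\mathcal{L}[\mu](z)=\int e^{-zt}\,\mu(dt)$.

```python
import numpy as np

a1, a2, b0 = 3.0, 2.0, 0.5   # hypothetical CARMA(2,1) parameters

def fourier_g(y):
    """F[g](y) = (iy + b0) / (-y^2 + a1*iy + a2)."""
    iy = 1j * y
    return (iy + b0) / (iy**2 + a1 * iy + a2)   # iy**2 equals -y^2

def laplace_eta(z):
    """Closed-form Laplace transform of
    eta(dt) = (b0 - a1) delta_0(dt) - c * exp(-b0 t) 1_{[0,inf)}(t) dt."""
    c = a2 - b0 * (a1 - b0)
    return (b0 - a1) - c / (z + b0)

def h(z):
    """h(z) = z - L[eta](z), as in (2.2)."""
    return z - laplace_eta(z)

y = np.linspace(-50.0, 50.0, 2001)
max_gap = float(np.max(np.abs(1.0 / h(1j * y) - fourier_g(y))))
```

The gap is at the level of rounding error, consistent with $h(z) = (z^2+a_1z+a_2)/(z+b_0)$.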

Example 2.12. In this example we consider a delay measure $\eta$ where the corresponding solution to the SDDE in (2.1) may be regarded as a CARMA process with fractional polynomials. Specifically, consider

$$\eta(dt) = \alpha_1\delta_0(dt) + \frac{\alpha_2}{\Gamma(\beta)}\mathbf{1}_{[0,\infty)}(t)\,t^{\beta-1}e^{-\gamma t}\,dt,$$

where $\beta,\gamma > 0$ and $\Gamma$ is the gamma function. In this case, $h(z) = z - \alpha_1 - \alpha_2(z+\gamma)^{-\beta}$

and hence

is of the form

(

)

(

), where

(

) =

for suitable


constants

0 and

∈ R

. In this way, one may think of

as a ratio of fractional

polynomials (recall from Example 2.11 that the solution to

(2.1)

will sometimes be a

regular CARMA process when

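$\beta\in\mathbb{N}$). By Lemma 2.2 and Theorem 2.5 the associated SDDE has a unique solution with differential resolvent $x_0$ satisfying $x_0(t) = 0$ for $t < 0$ if

$$\operatorname{Re}(z)\geq 0 \implies z - \alpha_1 - \alpha_2(z+\gamma)^{-\beta}\neq 0. \tag{2.12}$$

Each of the following two cases is sufficient for (2.12) to be satisfied:

(i) $\alpha_1 + |\alpha_2|\gamma^{-\beta} < 0$: In this case we have in particular that $\alpha_1 < 0$, so

$$|z - \alpha_1 - \alpha_2(z+\gamma)^{-\beta}| \geq -\alpha_1 - |\alpha_2||(z+\gamma)^{-\beta}| \geq -\alpha_1 - |\alpha_2|\gamma^{-\beta} > 0$$

whenever $\operatorname{Re}(z)\geq 0$.

(ii) $\alpha_1,\alpha_2 < 0$ and $\beta < 1$: In this case $\operatorname{Re}((z+\gamma)^{-\beta}) > 0$ and, thus, $\operatorname{Re}(z - \alpha_1 - \alpha_2(z+\gamma)^{-\beta}) > 0$ as long as $\operatorname{Re}(z)\geq 0$.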
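The lower bound in case (i) can be illustrated by evaluating $h(z) = z-\alpha_1-\alpha_2(z+\gamma)^{-\beta}$ on a grid in the closed right half-plane. The following sketch assumes hypothetical parameter values satisfying $\alpha_1+|\alpha_2|\gamma^{-\beta}<0$, uses the principal branch of the complex power (which is what NumPy computes), and of course only inspects a bounded grid rather than the whole half-plane.

```python
import numpy as np

alpha1, alpha2, beta, gamma = -2.0, 1.5, 0.7, 1.2    # hypothetical values
bound = -alpha1 - abs(alpha2) * gamma ** (-beta)      # claimed lower bound on |h|

def h(z):
    """h(z) = z - alpha1 - alpha2 * (z + gamma)^(-beta), principal branch."""
    return z - alpha1 - alpha2 * (z + gamma) ** (-beta)

x = np.linspace(0.0, 20.0, 201)                       # Re(z) >= 0
y = np.linspace(-20.0, 20.0, 401)                     # Im(z)
zz = x[:, None] + 1j * y[None, :]
min_abs_h = float(np.min(np.abs(h(zz))))
```

With these values the minimum over the grid is attained at $z=0$, where the chain of inequalities in case (i) holds with equality because $\alpha_2>0$.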

Example 2.13. Let $\eta$ be any finite signed measure with second moment, which satisfies $h(iy)\neq 0$ for all $y\in\mathbb{R}$. Consider the case where $(Z_t)_{t\in\mathbb{R}}$ is a fractional Lévy process, that is,

$$Z_t = \frac{1}{\Gamma(1+d)}\int_{\mathbb{R}} \big[(t-u)_+^d - (-u)_+^d\big]\,dL_u, \quad t\in\mathbb{R},$$

where $d\in(0,1/2)$ and $(L_t)_{t\in\mathbb{R}}$ is a centered and square integrable Lévy process. Let

$$\theta(t) = \frac{t_+^d}{\Gamma(1+d)}, \quad t\in\mathbb{R}.$$

Then it follows by Corollary 2.6 that the solution to (2.1) takes the form

$$X_t = \int_{\mathbb{R}} \theta*\mu_{x_0}(t-u)\,dL_u, \quad t\in\mathbb{R}.$$

It is not too difficult to show that $\theta*\mu_{x_0}$ coincides with the left-sided Riemann–Liouville fractional integral of $x_0$, and hence $X_t = \int_{\mathbb{R}} x_0(t-u)\,dZ_u$, where the integral with respect to $(Z_t)_{t\in\mathbb{R}}$ is defined as in [18]. Consequently, we can use the proof of [18, Theorem 6.3] to deduce that $(X_t)_{t\in\mathbb{R}}$ has long memory in the sense that its autocovariance function is hyperbolically decaying at $\infty$:

$$\gamma_X(h) := \mathbb{E}[X_hX_0] \sim \frac{\Gamma(1-2d)}{\Gamma(d)\Gamma(1-d)}\,\frac{\mathbb{E}[L_1^2]}{h(0)^2}\,h^{2d-1}, \quad h\to\infty. \tag{2.13}$$

In particular, (2.13) shows that $\gamma_X\notin L^1$.

Our last example, presented below, deals with a situation where Theorem 2.5 is applicable, but $(Z_t)_{t\in\mathbb{R}}$ is not of the form (2.8). It is closely related to [2, Corollary 2.3].

Example 2.14. Let $(B_t)_{t\in\mathbb{R}}$ be a Brownian motion with respect to a filtration $(\mathcal{F}_t)_{t\in\mathbb{R}}$. Moreover, let $(\sigma_t)_{t\in\mathbb{R}}$ be a predictable process with $\sigma_0\in L^2(\mathbb{P})$, and assume that $(\sigma_t,B_t)_{t\in\mathbb{R}}$ and $(\sigma_{t+u},B_{t+u}-B_u)_{t\in\mathbb{R}}$ have the same finite-dimensional marginal distributions for all $u\in\mathbb{R}$. In this case

$$Z_t = \int_0^t \sigma_s\,dB_s, \quad t\in\mathbb{R},$$

is well-defined, continuous and square integrable, and it has stationary increments. Here we use the convention $\int_0^t := -\int_t^0$ when $t < 0$. Under the assumptions that $h(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$ and $\eta$ has second moment, Theorem 2.5 implies that there exists a unique stationary solution $(X_t)_{t\in\mathbb{R}}$ to (2.1) and, since $x_0(t) = 0$ for $t < 0$, it is given by

$$\begin{aligned} X_t &= Z_t + \int_0^\infty Z_{t-s}\int_{[0,\infty)} x_0(s-v)\,\eta(dv)\,ds\\ &= -\int_0^\infty [Z_t - Z_{t-s}]\int_{[0,\infty)} x_0(s-v)\,\eta(dv)\,ds\\ &= -\int_{-\infty}^t\int_{t-u}^\infty\int_{[0,\infty)} x_0(s-v)\,\eta(dv)\,ds\,\sigma_u\,dB_u\\ &= \int_{-\infty}^t x_0(t-u)\sigma_u\,dB_u \end{aligned}$$

for $t\in\mathbb{R}$, where we have used Corollary 2.3, (4.9) and an extension of the stochastic Fubini given in [22, Chapter IV, Theorem 65] to integrals over unbounded intervals.

3 The level model

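In this section we consider the equation

$$X_t = \int_0^\infty X_{t-u}\,\phi(du) + \int_{-\infty}^t \theta(t-u)\,dL_u, \quad t\in\mathbb{R}, \tag{3.1}$$

where $\phi$ is a finite signed measure on $(0,\infty)$, $(L_t)_{t\in\mathbb{R}}$ is an integrable Lévy process with $\mathbb{E}[L_1] = 0$ and $\theta\colon\mathbb{R}\to\mathbb{R}$ is a measurable function, which vanishes on $(-\infty,0)$ and satisfies (1.6).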

Remark 3.1. Due to the extreme flexibility of the model (3.1), one should require that $\phi$ and $\theta$ take a particularly simple form. To elaborate, under the assumptions of Theorem 3.2 or Remark 3.3, a solution to (3.1) associated to the pair $(\phi,\theta)$ is a causal moving average with kernel $\psi$. On the other hand, this solution could also have been obtained using the pair $(0,\psi)$. However, it might be that $\phi$ and $\theta$ have a simple form while $\psi$ has not, and hence (3.1) should be used to obtain parsimonious representations of a wide range of processes. This idea is similar to that of the discrete-time stationary ARMA processes, which could as well have been represented as an MA($\infty$) process or (under an invertibility assumption) an AR($\infty$) process.

Equation (3.1) can be solved using the backward recursion method under the contraction assumption $|\phi|((0,\infty)) < 1$, and this is how we obtain Theorem 3.2. For the noise term we will put the additional assumption that $\mathbb{E}[L_1^2] < \infty$, and hence (in view of (1.6)) that $\theta\in L^2$. In the formulation we will denote by $\phi^{*n}$ the $n$-fold convolution of $\phi$, that is, $\phi^{*n} := \phi*\cdots*\phi$ for $n\in\mathbb{N}$ and $\phi^{*0} = \delta_0$.

Theorem 3.2. Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] < \infty$, and suppose that $\theta\in L^2$, and $|\phi|((0,\infty)) < 1$. Then there exists a unique square integrable solution to (3.1). It is given by

$$X_t = \int_{-\infty}^t \psi(t-u)\,dL_u, \quad t\in\mathbb{R},$$

where $\psi := \sum_{n=0}^\infty \theta*\phi^{*n}$ exists as a limit in $L^2$ and vanishes on $(-\infty,0)$.

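Remark 3.3. One can ask for the existence of solutions to (3.1) under weaker conditions on $\phi$ than $|\phi|((0,\infty)) < 1$ (as imposed in Theorem 3.2). In particular, suppose still that $\mathbb{E}[L_1] = 0$, $\mathbb{E}[L_1^2] < \infty$ and $\theta\in L^2$, but instead of $|\phi|((0,\infty)) < 1$ suppose for some $a > 0$ that $\mathcal{L}[\phi](z)\neq 1$ whenever $\operatorname{Re}(z)\in(0,a)$ and

$$\sup_{0<x<a}\int_{\mathbb{R}} \Big|\frac{\mathcal{L}[\theta](x+iy)}{1-\mathcal{L}[\phi](x+iy)}\Big|^2\,dy < \infty. \tag{3.2}$$

Under these assumptions one can find a function $\psi\in L^2$, such that $(u\mapsto e^{-cu}\psi(u))\in L^1$ for all $c\in(0,a)$ and

$$\mathcal{L}[\psi](z) = \frac{\mathcal{L}[\theta](z)}{1-\mathcal{L}[\phi](z)}, \quad 0 < \operatorname{Re}(z) < a. \tag{3.3}$$

This is shown in Lemma 4.1. For this $\psi$ it follows that $\mathcal{L}[\psi](z) = \mathcal{L}[\psi](z)\mathcal{L}[\phi](z) + \mathcal{L}[\theta](z)$, and hence

$$\mathcal{L}[\psi(t-\,\cdot\,)](-z) = e^{zt}\big(\mathcal{L}[\psi](z)\mathcal{L}[\phi](z) + \mathcal{L}[\theta](z)\big) = \mathcal{L}\Big[\int_0^\infty \psi(t-u-\,\cdot\,)\,\phi(du) + \theta(t-\,\cdot\,)\Big](-z)$$

for each fixed $t\in\mathbb{R}$ and all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\in(0,a)$. By uniqueness of Laplace transforms, this establishes that

$$\psi(t-r) = \int_0^\infty \psi(t-u-r)\,\phi(du) + \theta(t-r) \tag{3.4}$$

for Lebesgue almost all $r\in\mathbb{R}$ and each fixed $t\in\mathbb{R}$. By integrating both sides of (3.4) with respect to $dL_r$ and using a stochastic Fubini result (e.g., [2, Theorem 3.1]) it follows that the moving average $X_t = \int_{\mathbb{R}} \psi(t-r)\,dL_r$, $t\in\mathbb{R}$, is a solution to (3.1).

To see that the conditions on $\phi$ imposed here are weaker than $|\phi|((0,\infty)) < 1$ as imposed in Theorem 3.2, observe that $\mathcal{L}[\phi](z)\neq 1$ whenever $\operatorname{Re}(z)\in(0,a)$ by the inequality $|\mathcal{L}[\phi](z)|\leq|\phi|((0,\infty))$, and

$$\sup_{0<x<a}\int_{\mathbb{R}} \Big|\frac{\mathcal{L}[\theta](x+iy)}{1-\mathcal{L}[\phi](x+iy)}\Big|^2\,dy \leq \frac{2\pi}{(1-|\phi|((0,\infty)))^2}\int_0^\infty \theta(u)^2\,du. \tag{3.5}$$

In (3.5) we have made use of Plancherel’s theorem. Suppose that $|\phi|((0,\infty)) < 1$ so that Theorem 3.2 is applicable, let $\psi$ be defined through (3.3) and set $\tilde\psi = \sum_{n=0}^\infty \theta*\phi^{*n}$. Then it follows by uniqueness of solutions to (3.1) and the isometry property of the integral map that

$$0 = \mathbb{E}\Big[\Big(\int_{\mathbb{R}} \psi(t-u)\,dL_u - \int_{\mathbb{R}} \tilde\psi(t-u)\,dL_u\Big)^2\Big] = \mathbb{E}[L_1^2]\int_{\mathbb{R}} \big(\psi(u)-\tilde\psi(u)\big)^2\,du.$$

This shows that $\psi = \tilde\psi$ almost everywhere and that $\sum_{n=0}^\infty \theta*\phi^{*n}$ is an alternative characterization of $\psi$ when $|\phi|((0,\infty)) < 1$. Another argument, which does not rely on the uniqueness of solutions to (3.1), would be to show that $\psi$ and $\tilde\psi$ have the same Fourier transform.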


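Example 3.4. Suppose that $(L_t)_{t\in\mathbb{R}}$ is a Lévy process with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] < \infty$ and let $\theta\in L^2$. For $\alpha\in\mathbb{R}$ and $\beta > 0$, consider $\phi(dt) = \alpha e^{-\beta t}\mathbf{1}_{[0,\infty)}(t)\,dt$ and define the measure $\xi(dt) = e^{\alpha t}\phi(dt) = \alpha e^{-(\beta-\alpha)t}\mathbf{1}_{[0,\infty)}(t)\,dt$. We will argue that a solution to (3.1) exists as long as $\alpha/\beta < 1$ by considering the two cases (i) $-1 < \alpha/\beta < 1$ and (ii) $\alpha/\beta\leq -1$ separately.

(i) $-1 < \alpha/\beta < 1$: In this case $|\phi|((0,\infty)) = |\alpha|/\beta < 1$, and the existence of a solution is ensured by Theorem 3.2. To determine the solution kernel $\psi$, note that $\phi^{*n}(du) = \frac{\alpha^n}{(n-1)!}u^{n-1}e^{-\beta u}\mathbf{1}_{[0,\infty)}(u)\,du$ for $n\in\mathbb{N}$ and, thus,

$$\sum_{n=0}^N \theta*\phi^{*n}(t) = \theta(t) + \alpha\int_0^\infty \theta(t-u)e^{-\beta u}\sum_{n=0}^{N-1}\frac{(\alpha u)^n}{n!}\,du \to \theta(t) + \theta*\xi(t)$$

as $N\to\infty$ by Lebesgue’s theorem on dominated convergence. This shows that $\psi = \theta + \theta*\xi$.

(ii) $\alpha/\beta\leq -1$: In this case $|\phi|((0,\infty))\geq 1$, so Theorem 3.2 does not apply. However, observe that $\mathcal{L}[\phi](z) = \alpha/(z+\beta)\neq 1$ and

$$\frac{\mathcal{L}[\theta](z)}{1-\mathcal{L}[\phi](z)} = \frac{\mathcal{L}[\theta](z)}{1-\frac{\alpha}{z+\beta}} = \mathcal{L}[\theta](z) + \mathcal{L}[\theta](z)\frac{\alpha}{z+\beta-\alpha} = \mathcal{L}[\theta+\theta*\xi](z)$$

when $\operatorname{Re}(z) > 0$. The latter observation shows that

$$\sup_{x>0}\int_{\mathbb{R}} \Big|\frac{\mathcal{L}[\theta](x+iy)}{1-\mathcal{L}[\phi](x+iy)}\Big|^2\,dy \leq 2\pi\int_{\mathbb{R}} \big(\theta(u)+\theta*\xi(u)\big)^2\,du < \infty$$

by Plancherel’s theorem. Now Remark 3.3 implies that a solution to (3.1) also exists in this case and $\psi = \theta+\theta*\xi$ is the solution kernel.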
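The kernel identity of Example 3.4 can be probed numerically in the non-contractive case (ii): the kernel $\psi=\theta+\theta*\xi$ should satisfy the deterministic equation $\psi = \theta + \psi*\phi$, which is (3.4) after a change of variables. The sketch below assumes $\theta=\mathbf{1}_{[0,1)}$ and the values $\alpha=-2$, $\beta=1$ purely for illustration, uses a closed form of $\theta*\xi$ that is specific to this choice of $\theta$, and approximates the convolution with $\phi$ by a Riemann sum on a uniform grid.

```python
import numpy as np

alpha, beta = -2.0, 1.0            # hypothetical values with alpha/beta <= -1
kappa = beta - alpha               # decay rate of xi(dt) = alpha * exp(-kappa t) dt
dt, T = 1e-3, 5.0
s = np.arange(int(round(T / dt)) + 1) * dt

theta = (s < 1.0).astype(float)    # assumed kernel theta = indicator of [0, 1)
# Closed form of (theta * xi)(s) for s >= 0 and this particular theta.
theta_xi = (alpha / kappa) * (np.exp(-kappa * np.maximum(s - 1.0, 0.0)) - np.exp(-kappa * s))
psi = theta + theta_xi

phi_density = alpha * np.exp(-beta * s)
# Riemann-sum approximation of (psi * phi)(s) = int_0^s psi(s - u) phi(du).
psi_phi = np.convolve(psi, phi_density)[: len(s)] * dt

max_residual = float(np.max(np.abs(psi - theta - psi_phi)))
```

The residual is of the order of the grid spacing, even though $|\phi|((0,\infty)) = 2$ here, so the Neumann series of Theorem 3.2 is not available and the kernel comes from the Laplace-transform route of Remark 3.3 instead.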

The next example relates (3.1) to (2.1) in a certain setup.

Example 3.5. We will give an example of an SDDE where Theorem 2.5 does not provide a solution, but where a solution can be found by considering an associated level model. Consider the SDDE model (2.1) in the case where $\eta$ is absolutely continuous and its cumulative distribution function $F_\eta(t) := \eta([0,t])$, $t\geq 0$, satisfies

$$\int_0^\infty |F_\eta(t)|\,dt < 1. \tag{3.6}$$

This means in particular that $\eta([0,\infty)) = \lim_{t\to\infty} F_\eta(t) = 0$, and hence $h$ defined in (2.2) satisfies $h(0) = 0$ and Theorem 2.5 does not apply (cf. Remark 2.8). In fact, using a stochastic Fubini theorem (such as [2, Theorem 3.1]) and integration by parts on the delay term, the equation may be written as

$$X_t - X_s = \int_0^\infty [X_{t-u} - X_{s-u}]F_\eta(u)\,du + Z_t - Z_s, \quad s < t. \tag{3.7}$$

This shows that uniqueness does not hold, since if $(X_t)_{t\in\mathbb{R}}$ is a solution then so is $(X_t+\xi)_{t\in\mathbb{R}}$ for any $\xi\in L^1(\mathbb{P})$. Moreover, as noted in Remark 2.8, we cannot expect to find stationary solutions in this setup. In the following let us restrict the attention to the case where

$$Z_t = \int_{\mathbb{R}} [f(t-u) - f_0(-u)]\,dL_u, \quad t\in\mathbb{R},$$

for a given Lévy process $(L_t)_{t\in\mathbb{R}}$ with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] < \infty$, and for some functions $f,f_0\colon\mathbb{R}\to\mathbb{R}$, vanishing on $(-\infty,0)$, such that $u\mapsto f(t+u)-f_0(u)$ belongs to $L^2$. Using Theorem 3.2 we will now argue that there always exists a centered and square integrable solution with stationary increments in this setup and that the increments of any two such solutions are identical.

To show the uniqueness part, suppose that $(X_t)_{t\in\mathbb{R}}$ is a centered and square integrable stationary increment process which satisfies (3.7). Then, for any given $s > 0$ we have that the increment process $X(s)_t := X_t - X_{t-s}$, $t\in\mathbb{R}$, is a stationary, centered and square integrable solution to the level equation (3.1) with $\phi(du) = F_\eta(u)\,du$ and $\theta = f - f(\,\cdot\,-s)$. By the uniqueness part of Theorem 3.2 and (3.6) it follows that

$$X(s)_t = \int_{\mathbb{R}} \psi_s(t-u)\,dL_u, \quad t\in\mathbb{R},$$

where $\psi_s(t) = \sum_{n=0}^\infty\int_0^\infty [f(t-u) - f(t-s-u)]\,\phi^{*n}(du)$ (the sum being convergent in $L^2$). Consequently, by a stochastic Fubini result, $(X_t)_{t\in\mathbb{R}}$ must take the form

$$X_t = \xi + \sum_{n=0}^\infty\int_0^\infty [Z_{t-u} - Z_{-u}]\,\phi^{*n}(du), \quad t\in\mathbb{R}, \tag{3.8}$$

for a suitable $\xi\in L^2(\mathbb{P})$ with $\mathbb{E}[\xi] = 0$. Conversely, if one defines $(X_t)_{t\in\mathbb{R}}$ by (3.8) one can use the same reasoning as above to conclude that $(X_t)_{t\in\mathbb{R}}$ is a stationary increment solution to (2.1). It should be stressed that one can find other representations of the solution than (3.8) (e.g., in a similar manner as in Example 3.4). For more on non-stationary solutions to (2.1), see [20].

A nice property of the model

(3.1)

is that it is possible to recover the discrete-time

ARMA(

p, q

) process. Example 3.6 gives (well-known) results for ARMA processes by

using Remark 3.3. For an extensive treatment of ARMA processes, see e.g. [8].

Example 3.6. Let p,q ∈ N

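and define the polynomials $\Phi,\Theta\colon\mathbb{C}\to\mathbb{C}$ by
\[
\Phi(z) = 1-\phi_1 z-\cdots-\phi_p z^p \quad\text{and}\quad \Theta(z) = 1+\theta_1 z+\cdots+\theta_q z^q,
\]
where the coefficients are assumed to be real. Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with $\mathbb{E}[L_1]=0$ and $\mathbb{E}[L_1^2]<\infty$, and consider choosing $\phi(du)=\sum_{j=1}^p\phi_j\delta_j(du)$ and $\theta(u)=1_{[0,1)}(u)+\sum_{j=1}^q\theta_j 1_{[j,j+1)}(u)$. In this case (3.1) reads
\[
X_t = \sum_{i=1}^p\phi_i X_{t-i} + Z_t + \sum_{i=1}^q\theta_i Z_{t-i},\quad t\in\mathbb{R}, \tag{3.9}
\]
with $Z_t = L_t-L_{t-1}$. In particular, if $(X_t)_{t\in\mathbb{R}}$ is a solution to (3.9), $(X_t)_{t\in\mathbb{Z}}$ is a usual ARMA process. Suppose that $\Phi(z)\neq 0$ for all $z\in\mathbb{C}$ with $|z|=1$. Then, by continuity of $\Phi$, there exists $a>0$ such that $1-\mathcal{L}[\phi](z)=\Phi(e^{-z})$ is strictly separated from 0 for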

z ∈ C

with

(

)

∈

). Thus, since

θ ∈ L

, Remark 3.3 implies that there exists a

Paper B · Stochastic delay diﬀerential equations and related autoregressive models

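stationary solution to (3.1), and it is given by $X_t=\int_{\mathbb{R}}\psi(t-u)\,dL_u$, $t\in\mathbb{R}$, where $\psi$ is characterized by (3.3). Choose a small $\varepsilon>0$ and $(\psi_j)_{j\in\mathbb{Z}}$ so that the relation
\[
\frac{\Theta(z)}{\Phi(z)} = \sum_{j=-\infty}^{\infty}\psi_j z^j
\]
holds true for all $z\in\mathbb{C}$ with $1-\varepsilon<|z|<1+\varepsilon$. Then
\[
\mathcal{L}[\psi](z) = \mathcal{L}[1_{[0,1)}](z)\,\frac{\Theta(e^{-z})}{\Phi(e^{-z})} = \sum_{j=-\infty}^{\infty}\psi_j\,\mathcal{L}[1_{[j,j+1)}](z) = \mathcal{L}\Big[\sum_{j=-\infty}^{\infty}\psi_j 1_{[j,j+1)}\Big](z)
\]
for all $z\in\mathbb{C}$ with a positive real part sufficiently close to zero. Thus, we have the well-known representation $X_t=\sum_{j=-\infty}^{\infty}\psi_j Z_{t-j}$ for $t\in\mathbb{R}$.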
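In the causal special case (all zeros of $\Phi$ outside the closed unit disc, so that $\psi_j=0$ for $j<0$), the coefficients in the expansion of $\Theta/\Phi$ can be computed recursively by matching powers of $z$ in $\Theta(z)=\Phi(z)\sum_j\psi_j z^j$. The following Python sketch illustrates this; the function name and the numerical values are illustrative assumptions and do not come from the text, and the two-sided case treated in the example is not covered.

```python
import numpy as np

def ma_infinity_coefficients(phi, theta, n_terms):
    """Return psi_0, ..., psi_{n_terms-1} with Theta(z)/Phi(z) = sum_j psi_j z^j.

    Assumes the causal case (zeros of Phi outside the closed unit disc),
    so psi_j = 0 for j < 0. phi = [phi_1..phi_p], theta = [theta_1..theta_q].
    """
    psi = np.zeros(n_terms)
    for j in range(n_terms):
        # Coefficient of z^j in Theta(z): 1 for j = 0, theta_j for 1 <= j <= q.
        if j == 0:
            acc = 1.0
        elif j <= len(theta):
            acc = theta[j - 1]
        else:
            acc = 0.0
        # Matching powers gives psi_j = theta_j + sum_i phi_i * psi_{j-i}.
        for i in range(1, min(j, len(phi)) + 1):
            acc += phi[i - 1] * psi[j - i]
        psi[j] = acc
    return psi

# Illustrative ARMA(1,1): Phi(z) = 1 - 0.5 z, Theta(z) = 1 + 0.4 z.
psi = ma_infinity_coefficients(phi=[0.5], theta=[0.4], n_terms=4)
```

For this ARMA(1,1) choice the recursion reproduces the geometric pattern $\psi_j=(\phi_1+\theta_1)\phi_1^{j-1}$ for $j\ge 1$.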

4 Proofs and technical results

The first result is closely related to the characterization of the so-called Hardy spaces and some of the Paley–Wiener theorems. For more on these topics, see e.g. [11, Section 2.3] and [15, Chapter VI (Section 7)]. We will use the notation $S_{a,b}=\{z\in\mathbb{C} : a<\operatorname{Re}(z)<b\}$ throughout this section.

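Lemma 4.1. Let $-\infty\le a<b\le\infty$. Suppose that $F\colon\mathbb{C}\to\mathbb{C}$ is a function which is analytic on the strip $S_{a,b}$ and satisfies
\[
\sup_{a<x<b}\int_{\mathbb{R}}|F(x+iy)|^2\,dy<\infty. \tag{4.1}
\]
Then there exists a function $f\colon\mathbb{R}\to\mathbb{C}$ such that $(u\mapsto f(u)e^{cu})\in L^1$ for $c\in(a,b)$, $(u\mapsto f(u)e^{cu})\in L^2$ for $c\in[a,b]$, and
\[
\int_{\mathbb{R}}e^{zu}f(u)\,du = F(z)\quad\text{for } z\in S_{a,b}.
\]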

Remark 4.2. If $a=-\infty$, the property $(u\mapsto f(u)e^{au})\in L^2$ is understood as $f(u)=0$ for almost all $u<0$ and similarly, $f(u)=0$ for almost all $u>0$ if $(u\mapsto f(u)e^{bu})\in L^2$ for $b=\infty$.

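Proof of Lemma 4.1. Fix $c_1,c_2\in(a,b)$ with $c_1<c_2$. For any $y>0$ and $u\in\mathbb{R}$, consider (anti-clockwise) integration of $z\mapsto e^{-zu}F(z)$ along a rectangular contour with vertices $c_1-iy$, $c_2-iy$, $c_2+iy$, and $c_1+iy$:
\[
\begin{aligned}
0 = \oint e^{-zu}F(z)\,dz
&= \int_{c_1}^{c_2}e^{-(x-iy)u}F(x-iy)\,dx + ie^{-c_2u}\int_{-y}^{y}e^{-ixu}F(c_2+ix)\,dx\\
&\quad - \int_{c_1}^{c_2}e^{-(x+iy)u}F(x+iy)\,dx - ie^{-c_1u}\int_{-y}^{y}e^{-ixu}F(c_1+ix)\,dx.
\end{aligned}
\tag{4.2}
\]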

Since



−(x+iy)u

F(x + iy)dx



≤ e

−2(c

u∧c

−c

)

sup

a<x<b

|F(x + iy)|

dy < ∞,

we deduce the existence of a sequence (y

)

n∈N

⊆ (0,∞), such that y

→ ∞ and

−(x±iy

F(x ±iy

)dx −→ 0.

Furthermore, for k = 1,2 it holds that



u 7−→

−y

−ixu

F(c

+ ix) dx



−→



u 7−→F [F(c

+ i · )](u)



y → ∞

by Plancherel's theorem. In particular, this convergence holds along the sequence $(y_n)_{n\in\mathbb{N}}$ and, after passing to a subsequence of $(y_n)_{n\in\mathbb{N}}$ if necessary, we may also assume that

lim

n→∞

−y

−ixu

F(c

+ ix) dx = F [F(c

+ i · )](u), k = 1,2,

for Lebesgue almost all

u ∈ R

. Combining this with

(4.2)

yields

−c

[

(

i ·

)](

) =

−c

[

(

i ·

)](

) for almost all

u ∈ R

. Consequently, there exists a function

f : R → C

with the property that

(

) = (2

)

−1

−cu

[

(

i ·

)](

) for almost all

u ∈ R

for any given c ∈ (a,b). For such c we compute

f (u)|

du = (2π)

−2

|F [F(c + i · )](u)|

du ≤ sup

x∈(a,b)

|F(x + iy)|

dy < ∞.

Consequently, (

u 7→ f

(

)

∈ L

for any

c ∈

(

a,b

) and by Fatou’s Lemma, this holds

as well for

and

. Furthermore, if

c ∈

(

a,b

), we can choose

ε >

0 such that

c ±ε ∈ (a,b) as well, from which we get that



|f (u)|e



≤



∞

|f (u)e

(c+ε)u

du +

−∞

|f (u)e

(c−ε)u



∞

−2εu

du < ∞

by Hölder’s inequality. This shows that (

u 7→ f

(

)

∈ L

. Finally, we ﬁnd for

x + iy ∈ S

a,b

(by deﬁnition of f ) that

L[f ](z) =

iyu

f (u)du = F

−1

[F [F(x + i · )]](y) = F(z),

and this completes the proof. 

Proof of Lemma 2.2.

Observe that, generally,

(

)

0 if

(

)

≥

0 and

|z| > |η|

([0

,∞

))

and thus, under the assumption that

(

)

0 for all

y ∈ R

and by continuity of

there must be an

a >

0 such that

(

)

0 for all

z ∈ S

0,a

. The fact that

(

)

| ∼ |z|

|z| → ∞

when

(

)

≥

0 and, once again, the continuity of

imply that

(4.1)

is satisﬁed

for 1

(

− ·

) (over the interval (

−a,

0)), and thus we get the existence of a function

: R → R

such that

[

] = 1

0,a

and

t 7→ e

(

)

∈ L

for all

c ∈

[

−a,

0].

Observe that this gives in particular that

(−∞,0]

∈ L

and thus, since

∈ L

, we

also get that

(−∞,t]

∈ L

for all t ∈ R. This ensures that x

: R → R given by

(t) = 1

[0,∞)

(t) +

−∞

[0,∞)

(u −v)η(dv) du, t ∈ R,

is a well-deﬁned function. To establish the ﬁrst part of the statement (in particular

(2.3)

) it suﬃces to argue that

[

] = 1

0,a

. However, this follows from the

following calculation, which holds for an arbitrary z ∈ S

0,a



[0,∞)

−∞

[0,∞)

(u −v)η(dv) du



(z)

= z

−1

1 + L[

](z)L[η](z)

= z

−1

z −L[η](z)

h(z)

Suppose now that η has nth moment for some n ∈ N and note that

h(iy)| ≤ 1 +

[0,∞)

|η|(dv) < ∞,

for

k ∈ {

,... , n}

(

denoting the

th order derivative with respect to

). Since

(

)] will be a sum of terms of the form

(

)

(

)

l,m

= 1

,... , k

, and

(

y 7→

(

))

∈ L

, this means in turn that

(

i ·

)]

∈ L

for

= 1

,... , n

. Since

−1

maps

functions to

functions,

−1

[

(

i ·

)]]

∈ L

. Moreover, it is well-known

that if

f ,Df ∈ L

, we have the formula

−1

[

](

) =

−iuF

−1

[

](

) for

u ∈ R

, and

by an approximation argument it holds when

f ,Df ∈ L

as well (although only for

almost all u), cf. [1, Corollary 3.23]. Hence, by induction we establish that

−1



h(i · )



(u) = (−iu)

−1



h(i · )



(u) = (−iu)

(u).

This shows the ﬁrst part of (i). For any given

q ∈

/n,

2) it follows by Hölder’s

inequality that

(u)|

≤





(u)(1 + |u|

)





q/2



(1 + |u|

)

−2q/(2−q)



1−q/2

< ∞,

which shows

∈ L

. By using the relation

(2.3)

, which was veriﬁed just above, we

obtain

(t)| ≤ 1 +

−∞

[0,∞)

(u −v)||η|(dv) du ≤ |η|([0,∞))

(u)| du.

Since

∈ L

, the inequalities above imply

∈ L

∞

, and thus we get

∈ L

for

q ∈

/n,∞

], which shows the second part of (i). If

has an exponential moment of

order

then we can ﬁnd

ε ∈

,δ

) such that 1

(

− ·

) satisﬁes

(4.1)

over the interval

(

−a,ε

) and therefore, we have that

u 7→ x

(

)

∈ L

for

c ∈

[

−a,ε

]. If

(

)

0 for all

z ∈ C

with

(

)

≥

0 we can argue that

(4.1)

holds for 1

(

− ·

) with

−∞

and

= 0

in the same way as above and, thus, Lemma 4.1 implies x

(u) = 0 for u < 0. 

The following lemma is used to ensure uniqueness of solutions to (2.1):

Lemma 4.3.

Fix

s ∈ R

. Suppose that

(

)

0 for all

y ∈ R

and that, given (

)

t≤s

, a

process (X

)

t∈R

satisﬁes











[0,∞)

u−v

η(dv) du if t ≥ s,

if t < s.

(4.3)

almost surely for each

t ∈ R

(the

-null sets are allowed to depend on

) and

sup

t∈R

[

]

∞. Then

= X

(t −s) +

∞

(u−s,∞)

u−v

η(dv)x

(t −u) du

for Lebesgue almost all t ≥ s outside a P-null set.

Proof.

Observe that, by Fubini’s theorem, we can remove a

-null set and have that

(4.3)

is satisﬁed for Lebesgue almost all

t ∈ R

. Let

a >

0 be such that

(

)

0 for all

z ∈ S

0,a

(this is possible due to the assumption h(iy) , 0 for all y ∈ R). Note that



∞

−n

−1

| dt



≤ nsup

t∈R

E[|X

|] < ∞

for any given

n ∈ N

by Tonelli’s theorem. This means that

∞

−n

−1

| dt < ∞

for all

almost surely and, hence,

[

[s,∞)

] is well-deﬁned on

0,a

outside a

-null set. For

z ∈ S

0,a

we compute

L[X1

[s,∞)

](z) = L



[s,∞)



[0,∞)

u−v

η(dv) du



(z)

−zs

∞

−zt

[0,∞)

u−v

η(dv) du dt

−zs

[0,∞)

∞

u−v

∞

−zt

dt du η(dv)



−zs

[0,∞)

∞

s−v

−z(u+v)

du η(dv)





−zs

+ L[η](z)L[X1

[s,∞)

](z)

+ L



[s,∞)

( · −s,∞)

· −v

η(dv)



(z)



In the calculations above we have used Fubini’s theorem several times; speciﬁcally, in

the third and ﬁfth equality. These calculations are valid (at least after removing yet

another

-null set) by the same type of argument as used to establish that

[

[s,∞)

]

is well-deﬁned on

0,a

almost surely. For instance, Fubini’s theorem is applicable in

the third equality above for any z ∈ S

0,a

almost surely, since



∞

[0,∞)

−n

−1

u−v

||η|(dv) du dt



= |η|([0,∞))

∞

(t −s)e

−n

−1

dt sup

t∈R

E[|X

|] < ∞

for an arbitrary

n ∈ N

. Returning to the computations, we ﬁnd by rearranging terms

that

L[X1

[s,∞)

](z) =

−zs

h(z)

[s,∞)

( · −s,∞)

· −v

η(dv)

(z)

h(z)

. (4.4)

By applying the expectation operator, we note that

∞

(u−s,∞)

u−v

||η|(dv)|x

(t −u)| du < ∞ (4.5)

almost surely for each

t ∈ R

∞

|η|

((

u −s,∞

))

(

t −u

)

| du < ∞

. Since

|η|

([0

,∞

))

< ∞

it is suﬃcient that

(−∞,t−s]

∈ L

, but this is indeed the case (see the beginning of

the proof of Lemma 2.2). Consequently, Tonelli’s theorem implies that

(4.5)

holds for

Lebesgue almost all

t ∈ R

outside a

-null set. Furthermore, again by Lemma 2.2,

there exists ε > 0 such that

−εt

∞

(t −u)| du dt =

−εt

(t)| dt

∞

−εu

du < ∞.

From this it follows that, almost surely,

∞

(u−s,∞)

u−v

(

)

(

t − u

)

is well-

deﬁned and that its Laplace transform exists on S

0,ε

. We conclude that



∞

(u−s,∞)

u−v

η(dv)x

( · −u) du



(z) =

[s,∞)

( · −s,∞)

· −v

η(dv)

(z)

h(z)

for

z ∈ S

0,ε

, and the result follows since we also have

[

(

· −s

)](

) =

−zs

(

) for

z ∈ S

0,ε

. 

When proving Theorem 2.5, [2, Corollary A.3] will play a crucial role, and for reference we have chosen to include (a suitable version of) it here:

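Corollary 4.4 ([2, Corollary A.3]). Let $p\ge 1$ and $(X_t)_{t\in\mathbb{R}}$ be a measurable process with stationary increments and $\mathbb{E}[|X_t|^p]<\infty$ for all $t\in\mathbb{R}$. Then $(X_t)_{t\in\mathbb{R}}$ is continuous in $L^p(\mathbb{P})$ and there exist $\alpha,\beta>0$ such that $\mathbb{E}[|X_t|^p]^{1/p}\le\alpha+\beta|t|$ for all $t\in\mathbb{R}$.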

Proof of Theorem 2.5. We start by noting that if $(X_t)_{t\in\mathbb{R}}$ and $(Y_t)_{t\in\mathbb{R}}$ are two measurable, stationary and integrable ($\mathbb{E}[|X_0|],\mathbb{E}[|Y_0|]<\infty$) solutions to (2.1) then, for fixed $s\in\mathbb{R}$,
\[
U_t = U_s + \int_s^t\int_{[0,\infty)}U_{u-v}\,\eta(dv)\,du \tag{4.6}
\]
almost surely for each $t\in\mathbb{R}$, when we set $U_t := X_t-Y_t$. In particular, for a given $t\in\mathbb{R}$ we get by Lemma 4.3,
\[
U_r = U_s x_0(r-s) + \int_s^{\infty}\int_{(u-s,\infty)}U_{u-v}\,\eta(dv)\,x_0(r-u)\,du \tag{4.7}
\]
for Lebesgue almost all $r>t-1$ and all $s\in\mathbb{Q}$ with $s\le t-1$. For any such $r$ we observe that the right-hand side of (4.7) tends to zero in $L^1(\mathbb{P})$ as $\mathbb{Q}\ni s\to-\infty$, from which we deduce $U_r=0$ or, equivalently, $X_r=Y_r$ almost surely. By Corollary 4.4 it follows that $(U_r)_{r\in\mathbb{R}}$ is continuous in $L^1(\mathbb{P})$, and hence we get that $X_t=Y_t$ almost surely as well. This shows that a solution to (2.1) is unique up to modification.

We have

[

]

≤ a

b|u|

for any

with suitably chosen

a,b >

0 (see Corollary 4.4),

and this implies that



[0,∞)

(t −u −v)||η|(dv)du



≤ a|η|([0,∞))

(u)| du + b

|u|

[0,∞)

(t −u −v)||η|(dv) du

≤



a|η|([0,∞)) + b

[0,∞)

v |η|(dv)



(u)| du

+ b|η|([0,∞))

(|t|+ |u|)|x

(u)| du.

This is ﬁnite by Lemma 2.2 and Example 2.4, and

[0,∞)

(

t −u −v

)

(

)

therefore almost surely well-deﬁned.

To argue that

[0,∞)

(

t −u −v

)

(

)

t ∈ R

, satisﬁes

(2.1)

, let

s < t and note that by Lemma 2.2 we have

[0,∞)

u−v

η(dv) du −

[0,∞)

u−v

η(dv) du

[0,∞)

(u −v −r −w)η(dw) dr η(dv) du

[0,∞)

t−r−w

s−r−w

[0,∞)

(u −v)η(dv) du η(dw) dr

[0,∞)

(t −r −w) −x

(s −r −w)]η(dw) dr

−

[0,∞)

(t −r −w) −1

[0,∞)

(s −r −w)]η(dw) dr

[0,∞)

(t −r −w) −x

(s −r −w)]η(dw) dr

−

[0,∞)

r−w

η(dw) dr.

Next, we write

−Z

t−u

)

[0,∞)

(u −v)η(dv) du, t ∈ R, (4.8)

using that

[0,∞)

(u −v)η(dv) du =

(u) du η([0,∞)) = h(0)η([0,∞)) = −1. (4.9)

Since (Z

)

t∈R

is continuous in L

(P), one shows that the process

−n

−Z

t−u

)

[0,∞)

(u −v)η(dv) du, t ∈ R,

is stationary by approximating it by Riemann sums in

(

). Subsequently, due to

the fact that

→ X

almost surely as

n → ∞

for any

t ∈ R

, we conclude that (

)

t∈R

is stationary. Approximation arguments of this type are carried out in detail in [7, p. 20]. In case

(

)

0 for all

z ∈ C

with

(

)

≥

0, the causal representation

(2.7)

(

)

t∈R

follows from

(4.8)

and the fact that

(

) = 0 for

t <

0 by Lemma 2.2(iii). This

completes the proof. 

Proof of Corollary 2.6. It follows from (4.9) and Corollary 2.3 that

t−u

[0,∞)

(u −v)η(dv) du

t−u

−Z

]

[0,∞)

(u −v)η(dv) du

[θ(t −u −r) −θ(t −r)][µ

(du) −δ

(du)]dL

θ(t −u −r)µ

(du) dL

θ ∗x

(t −r) dL

where we have used that µ

(R) = 0 since x

(t) →0 for t → ±∞ by (2.3). 

Proof of Theorem 3.2.

First, observe that

k=0

θ ∗φ

∗k

→ ψ

n → ∞

for some

function ψ : R → R. To see this, set ψ

k=0

θ ∗φ

∗k

, let m < n and note that

(ψ

(t) −ψ

(t))

dt =

2π





k=m+1

θ ∗φ

∗k



(y)



dy (4.10)

for m < n by Plancherel’s theorem. For any y ∈ R we have that





k=m+1

θ ∗φ

∗k



(y)



≤ |F [θ](y)|

k=m+1

|φ|((0,∞))

≤

|F [θ](y)|

1 −|φ|((0,∞))

. (4.11)

The ﬁrst inequality in

(4.11)

shows that

[

k=m+1

θ ∗φ

∗k

](

)

| →

0 as

n,m → ∞

, and

hence we can use the second inequality of

(4.11)

and dominated convergence together

with the relation

(4.10)

to deduce that (

)

n∈N

is a Cauchy sequence in

. This

establishes the existence of

. Due to the fact that

is real-valued and vanishes on

(−∞,0) for all n ∈ N, the same holds for ψ almost everywhere.

Suppose now that we have a square integrable stationary solution (

)

t∈R

. Then,

using a stochastic Fubini (e.g., [2, Theorem 3.1]), it follows that for each

t ∈ R

and

almost surely,

= X ∗φ

∗n

(t) +

n−1

k=0



θ( · −u) dL



∗φ

∗k

(t)

= X ∗φ

∗n

(t) +

n−1

(t −u) dL

(4.12)

for an arbitrary n ∈ N. By Jensen’s inequality and stationarity of (X

)

u∈R

E[X ∗φ

∗n

(t)

] ≤ E



∞

t−u

||φ

∗n

|(du)





≤ |φ

∗n

|((0,∞))E[X

Since

[

]

< ∞

and

|φ

∗n

((0

,∞

)) =

|φ|

((0

,∞

))

→

0 as

n → ∞

, we establish that

X∗φ

∗n

(

)

→

0 in

(

) as

n → ∞

. Consequently,

(4.12)

shows that

(

t−u

)

→ X

(

) as

n → ∞

. On the other hand, by the isometry property of the stochastic

integral we also have that



ψ(t −u) dL

−

(t −u) dL





= E[L

]

(ψ(u)−ψ

(u))

du → 0

n → ∞

, and hence

(

t−u

)

almost surely by uniqueness of limits in

(

Conversely, deﬁne a square integrable stationary process (

)

t∈R

(

t−u

)

for t ∈ R. After noting that ψ

∗φ =

n+1

k=1

θ ∗φ

∗k

= ψ

n+1

−θ for all n, we ﬁnd

0 = limsup

n→∞

∞

[ψ

n+1

(u) −θ(u) −ψ

∗φ(u)]

∞

[ψ(u)−θ(u)−ψ ∗φ(u)]

= E



−X ∗φ(t)−

θ(t −u) dL





E[L

]

−1

Thus, (X

)

t∈R

satisﬁes (3.1). 

Acknowledgments

We thank the referees for constructive and detailed reports. Their comments and suggestions have helped us to improve the paper significantly. This work was supported by the Danish Council for Independent Research (grant DFF–4002–00003).

References

[1] Adams, R.A. and J.J.F. Fournier (2003). Sobolev spaces. Second edition. Vol. 140. Pure and Applied Mathematics (Amsterdam). Elsevier/Academic Press, Amsterdam, xiv+305.

[2] Barndorff-Nielsen, O.E. and A. Basse-O'Connor (2011). Quasi Ornstein–Uhlenbeck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[3] Barndorff-Nielsen, O.E., F.E. Benth and A.E.D. Veraart (2018). Ambit stochastics. Vol. 88. Probability Theory and Stochastic Modelling. Springer, Cham. doi: 10.1007/978-3-319-94129-5.

[4] Barndorff-Nielsen, O.E. and N. Shephard (2001). Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics. J. R. Stat. Soc. Ser. B Stat. Methodol. 63(2), 167–241.

[5] Basse-O'Connor, A. and J. Rosiński (2013). Characterization of the finite variation property for a class of stationary increment infinitely divisible processes. Stochastic Process. Appl. 123(6), 1871–1890. doi: 10.1016/j.spa.2013.01.014.

[6] Basse-O'Connor, A. and J. Rosiński (2016). On infinitely divisible semimartingales. Probab. Theory Related Fields 164(1–2), 133–163. doi: 10.1007/s00440-014-0609-1.

[7] Basse-O'Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2018). Multivariate stochastic delay differential equations and CAR representations of CARMA processes. Stochastic Process. Appl. Forthcoming. doi: 10.1016/j.spa.2018.11.011.

[8] Brockwell, P.J. and R.A. Davis (2006). Time series: theory and methods. Springer Series in Statistics. Reprint of the second (1991) edition. Springer, New York.

[9] Brockwell, P.J. and A. Lindner (2009). Existence and uniqueness of stationary Lévy-driven CARMA processes. Stochastic Process. Appl. 119(8), 2660–2681. doi: 10.1016/j.spa.2009.01.006.

[10] Brockwell, P.J. and T. Marquardt (2005). Lévy-driven and fractionally integrated ARMA processes with continuous time parameter. Statist. Sinica 15(2), 477–494.

[11] Dym, H. and H.P. McKean (1976). Gaussian processes, function theory, and the inverse spectral problem. Probability and Mathematical Statistics, Vol. 31. New York: Academic Press [Harcourt Brace Jovanovich Publishers].

[12] Gripenberg, G., S.-O. Londen and O. Staffans (1990). Volterra integral and functional equations. Vol. 34. Encyclopedia of Mathematics and its Applications. Cambridge University Press. doi: 10.1017/CBO9780511662805.

[13] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[14] Jones, R.H. and L.M. Ackerson (1990). Serial correlation in unequally spaced longitudinal data. Biometrika 77(4), 721–731. doi: 10.1093/biomet/77.4.721.

[15] Katznelson, Y. (2004). An introduction to harmonic analysis. Third edition. Cambridge Mathematical Library. Cambridge University Press. doi: 10.1017/CBO9781139165372.

[16] Küchler, U. and B. Mensch (1992). Langevin's stochastic differential equation extended by a time-delayed term. Stochastics Stochastics Rep. 40(1-2), 23–42. doi: 10.1080/17442509208833780.

[17] Küchler, U. and M. Sørensen (2013). Statistical inference for discrete-time samples from affine stochastic delay differential equations. Bernoulli 19(2), 409–425. doi: 10.3150/11-BEJ411.

[18] Marquardt, T. (2006). Fractional Lévy processes with an application to long memory moving average processes. Bernoulli 12(6), 1099–1126.

[19] Marquardt, T. and R. Stelzer (2007). Multivariate CARMA processes. Stochastic Process. Appl. 117(1), 96–120. doi: 10.1016/j.spa.2006.05.014.

[20] Nielsen, M.S. (2019). On non-stationary solutions to MSDDEs: representations and the cointegration space. arXiv: 1903.02066.

[21] Nielsen, M.S. and V.U. Rohde (2017). Recovering the background noise of a Lévy-driven CARMA process using an SDDE approach. Proceedings ITISE 2017 2, 707–718.

[22] Protter, P.E. (2004). Stochastic Integration and Differential Equations. Second edition. Vol. 21. Applications of Mathematics (New York). Stochastic Modelling and Applied Probability. Berlin: Springer-Verlag.

[23] Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[24] Reiß, M., M. Riedle and O. van Gaans (2006). Delay differential equations driven by Lévy processes: stationarity and Feller properties. Stochastic Process. Appl. 116(10), 1409–1432. doi: 10.1016/j.spa.2006.03.002.

[25] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Vol. 68. Cambridge Studies in Advanced Mathematics. Translated from the 1990 Japanese original, revised by the author. Cambridge University Press.

[26] Schwartz, E.S. (1997). The stochastic behavior of commodity prices: Implications for valuation and hedging. J. Finance 52(3), 923–973.

[27] Todorov, V. and G. Tauchen (2006). Simulation methods for Lévy-driven continuous-time autoregressive moving average (CARMA) stochastic volatility models. J. Bus. Econom. Statist. 24(4), 455–469. doi: 10.1198/073500106000000260.

[28] Yaglom, A.M. (1987). Correlation theory of stationary and related random functions. Vol. I. Springer Series in Statistics. Basic results. New York: Springer-Verlag.

Paper C

Recovering the Background Noise of a Lévy-Driven CARMA Process Using an SDDE Approach

Mikkel Slot Nielsen and Victor Rohde

Abstract

Based on a vast amount of literature on continuous-time ARMA processes, the so-called CARMA processes, we exploit their relation to stochastic delay differential equations (SDDEs) and provide a simple and transparent way of estimating the background driving noise. An estimation technique for CARMA processes, which is particularly tailored for the SDDE specification, is given along with an alternative and (for the purpose) suitable state-space representation. Through a simulation study of the celebrated CARMA(2,1) process we check the ability of the approach to recover the distribution of the noise.

Keywords: Continuous-time ARMA process; Lévy processes; Noise estimation;

Stochastic volatility

1 Introduction

Continuous-time ARMA processes, specifically the class of CARMA processes, have been studied extensively and have found several applications. The most basic CARMA process is the CAR(1) process, which corresponds to the Ornstein–Uhlenbeck process. This process serves as the building block in stochastic modeling, e.g., Barndorff-Nielsen and Shephard [1] use it as the stochastic volatility component in option pricing modeling and Schwartz [13] models the (log) spot price of many different commodities through an Ornstein–Uhlenbeck specification. More recently, several researchers have paid attention to higher order CARMA processes. To give a few examples, Brockwell et al. [8] model turbulent wind speed data as a CAR(2) process, García et al. [11] and Benth et al. [3] fit a CARMA(2,1) process to electricity spot prices, and Benth et al. [4] find a good fit of the CAR(3) process to daily temperature observations (and thus suggest a suitable model for the OTC market for temperature derivatives). In addition, as for the CAR(1) process, several studies have concerned the use of CARMA processes in the modeling of stochastic volatility (see, e.g., [7, 14, 16]).

From a statistical point of view, as noted in the above references, the ability to recover the underlying noise of the CARMA process is important. However, while it is possible to recover the driving noise process, it is a subtle task. Due to the non-trivial nature of the typical algorithm, see [7], implementation is not straightforward and approximation errors may be difficult to locate. The recent study of Basse-O'Connor et al. [2] on processes of ARMA structure relates CARMA processes to certain stochastic (delay) differential equations, and this leads to an alternative way of backing out the noise from the observed process which is transparent and easy to implement. The contribution of this paper is to exploit this result to obtain a simple way of recovering the driving noise. The study both relies on and supports the related work of Brockwell et al. [7].

Section 2 recalls a few central deﬁnitions and gives a dynamic interpretation

of CARMA processes by relating them to solutions of stochastic delay diﬀerential

equations. Section 3 brieﬂy discusses how to do (consistent) estimation and inference

in the dynamic model and, ﬁnally, in Section 4 we investigate through a simulation

study the ability of the approach to recover the distribution of the underlying noise

for two sample frequencies.

2 CARMA processes and their dynamic SDDE representation

Recall that a Lévy process is interpreted as the continuous-time analogue to the (discrete-time) random walk. More precisely, a (one-sided) Lévy process $(L_t)_{t\ge 0}$, $L_0=0$, is a stochastic process having stationary independent increments and càdlàg sample paths. From these properties it follows that the distribution of $L_1$ is infinitely divisible, and the distribution of $(L_t)_{t\ge 0}$ is determined by the one of $L_1$ according to the relation $\log\mathbb{E}[e^{iyL_t}] = t\log\mathbb{E}[e^{iyL_1}]$ for $y\in\mathbb{R}$ and $t\ge 0$. The definition is extended to a two-sided Lévy process $(L_t)_{t\in\mathbb{R}}$, $L_0=0$, which can be constructed from a one-sided Lévy process $(L^1_t)_{t\ge 0}$ by taking an independent copy $(L^2_t)_{t\ge 0}$ and setting $L_t=L^1_t$ if $t\ge 0$ and $L_t=-L^2_{(-t)-}$ if $t<0$. Throughout, $(L_t)_{t\in\mathbb{R}}$ denotes a two-sided Lévy process, which is assumed to be square integrable.

Next, we will give a brief recap of Lévy-driven CARMA processes. (For an extensive treatment, see [5, 7, 9].) Let $p\in\mathbb{N}$ and set
\[
P(z) = z^p + a_1z^{p-1} + \cdots + a_p \quad\text{and}\quad Q(z) = b_0 + b_1z + \cdots + b_{p-1}z^{p-1} \tag{2.1}
\]
for $z\in\mathbb{C}$ and $a_1,\dots,a_p,b_0,\dots,b_{p-1}\in\mathbb{R}$. Define
\[
A = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1\\
-a_p & -a_{p-1} & -a_{p-2} & \cdots & -a_1
\end{bmatrix},
\]
$e_p = [0,\dots,0,1]^\top\in\mathbb{R}^p$, and $b = [b_0,b_1,\dots,b_{p-2},b_{p-1}]^\top$. In order to ensure the existence of a causal CARMA process we will assume that the eigenvalues of $A$ or, equivalently, the zeroes of $P$ all have negative real parts. Then there is a unique (strictly) stationary $\mathbb{R}^p$-valued process $(X_t)_{t\in\mathbb{R}}$ satisfying
\[
dX_t = AX_t\,dt + e_p\,dL_t, \tag{2.2}
\]
and it is explicitly given by $X_t = \int_{-\infty}^t e^{A(t-u)}e_p\,dL_u$ for $t\in\mathbb{R}$. For a given $q\in\mathbb{N}_0$ with $q<p$, we set $b_q=1$ and $b_j=0$ for $q<j<p$. A CARMA($p,q$) process $(Y_t)_{t\in\mathbb{R}}$ is then the strictly stationary process defined by
\[
Y_t = b^\top X_t,\quad t\in\mathbb{R}. \tag{2.3}
\]
This is the state-space representation of the formal stochastic differential equation
\[
P(D)Y_t = Q(D)DL_t,\quad t\in\mathbb{R}, \tag{2.4}
\]
where $D$ denotes differentiation with respect to time. One says that $(Y_t)_{t\in\mathbb{R}}$ is causal, since $Y_t$ is independent of $(L_s-L_t)_{s>t}$ for all $t\in\mathbb{R}$. We will say that $(Y_t)_{t\in\mathbb{R}}$ is invertible if all the zeroes of $Q$ have negative real parts. The word "invertible" is justified by Theorem 2.1 below and the fact that this is the assumption imposed in [7] in order to make the recovery of the increments of the Lévy process possible. In Figure 1 we have simulated a CARMA(2,1) process driven by a gamma (Lévy) process and by a Brownian motion, respectively.
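The causality condition above can be checked numerically. The following Python sketch builds the companion matrix of the state equation (2.2) from the coefficients of $P$ and compares its eigenvalues with the zeroes of $P$; the function name and the coefficient values are illustrative assumptions, not taken from the text.

```python
import numpy as np

def companion_matrix(a):
    """Companion matrix for P(z) = z^p + a_1 z^{p-1} + ... + a_p, with a = [a_1..a_p]."""
    a = np.asarray(a, dtype=float)
    p = len(a)
    A = np.zeros((p, p))
    A[:-1, 1:] = np.eye(p - 1)   # ones on the superdiagonal
    A[-1, :] = -a[::-1]          # last row: -a_p, -a_{p-1}, ..., -a_1
    return A

# Illustrative choice: P(z) = z^2 + 3z + 2 = (z + 1)(z + 2).
a = [3.0, 2.0]
eigenvalues = np.sort(np.linalg.eigvals(companion_matrix(a)).real)
roots = np.sort(np.roots([1.0, *a]).real)
is_causal = bool(np.all(eigenvalues < 0))   # all zeroes of P in the left half-plane
```

Both zeroes are real here, so comparing real parts is enough; with complex zeroes one would compare the real parts of the eigenvalues with zero in the same way.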

Figure 1: A simulation of a CARMA(2,1) process with parameters $a_1=1.3619$, $a_2=0.0443$, and $b_0=0.2061$. It is driven by a gamma (Lévy) process with parameters 0.2488 and 0.5792 on the left and a Brownian motion with mean $\mu=0.1441$ and standard deviation $\sigma=0.2889$ on the right.
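The two noise specifications in the caption of Figure 1 appear to be moment matched. As a quick check, and assuming (this is an assumption, since the parameter symbols are not legible here) that the two gamma parameters are a shape and a scale for the law of $L_1$, the following Python lines compute the implied mean and standard deviation:

```python
# Assumption: 0.2488 is the shape and 0.5792 the scale of the gamma law of L_1.
shape, scale = 0.2488, 0.5792

gamma_mean = shape * scale               # mean of a gamma(shape, scale) variable
gamma_std = (shape * scale**2) ** 0.5    # standard deviation of the same variable
```

Under this reading the gamma-driven and the Brownian-driven processes share the first two moments of the driving noise, so differences between the two panels reflect the shape of the noise distribution rather than its scale.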

For a given finite (signed) measure $\eta$ concentrated on $[0,\infty)$ we will adopt a definition from [2] and say that an integrable measurable process $(Y_t)_{t\in\mathbb{R}}$ is a solution to the associated Lévy-driven stochastic delay differential equation (SDDE) if it is stationary and satisfies
\[
dY_t = \int_{[0,\infty)}Y_{t-u}\,\eta(du)\,dt + dL_t,\quad t\in\mathbb{R}. \tag{2.5}
\]
In the formulation of the next result we denote by $\delta_0$ the Dirac measure at 0 and use the convention $\prod_{\emptyset}=1$ and $\sum_{\emptyset}=0$. Furthermore, we introduce the finite measure $\eta_\beta(dt) = 1_{[0,\infty)}(t)e^{\beta t}\,dt$ for $\beta\in\mathbb{C}$ with $\operatorname{Re}(\beta)<0$, and let $\eta_0=\delta_0$ and $\eta_j=\eta_{j-1}*\eta_{\beta_j}$ for $j=1,\dots,p-1$. By relying on [2, Corollary 3.12] we get the following dynamic SDDE representation of an invertible CARMA($p,p-1$) process:

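Theorem 2.1. Let $(Y_t)_{t\in\mathbb{R}}$ be an invertible CARMA($p,p-1$) process and let $\beta_1,\dots,\beta_{p-1}$ be the roots of $Q$. Then $(Y_t)_{t\in\mathbb{R}}$ is the (up to modification) unique stationary solution to (2.5) with the real-valued measure $\eta$ given by
\[
\eta = \sum_{j=0}^{p-1}\alpha_j\eta_j, \tag{2.6}
\]
where $\alpha_0,\dots,\alpha_{p-1}\in\mathbb{C}$ are chosen such that the relation
\[
P(z) = z\prod_{k=1}^{p-1}(z-\beta_k) - \sum_{j=0}^{p-1}\alpha_j\prod_{k=j+1}^{p-1}(z-\beta_k) \tag{2.7}
\]
holds for all $z\in\mathbb{C}$. In particular, if $\beta_1,\dots,\beta_{p-1}$ are distinct,
\[
\eta(dt) = \gamma_0\delta_0(dt) + \Big(1_{[0,\infty)}(t)\sum_{i=1}^{p-1}\gamma_ie^{\beta_it}\Big)dt \tag{2.8}
\]
where
\[
\gamma_0 = -\Big(a_1+\sum_{j=1}^{p-1}\beta_j\Big)\quad\text{and}\quad \gamma_i = -\frac{P(\beta_i)}{Q'(\beta_i)}\quad\text{for } i=1,\dots,p-1.
\]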

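Proof. It follows immediately from [2, Corollary 3.12] that $(Y_t)_{t\in\mathbb{R}}$ is the unique stationary solution to (2.5) with $\eta$ given by (2.6). Assume now that the roots of $Q$ are distinct. Then relation (2.7) implies in particular that
\[
\alpha_0 = -\Big(a_1+\sum_{j=1}^{p-1}\beta_j\Big) = \gamma_0.
\]
Moreover, an induction argument shows that
\[
\eta_j(dt) = 1_{[0,\infty)}(t)\sum_{i=1}^{j}e^{\beta_it}\prod_{k=1,k\neq i}^{j}(\beta_i-\beta_k)^{-1}\,dt,\quad j=1,\dots,p-1,
\]
from which it follows that
\[
\begin{aligned}
\eta(dt)-\alpha_0\delta_0(dt)
&= \sum_{j=1}^{p-1}\alpha_j\Big(1_{[0,\infty)}(t)\sum_{i=1}^{j}e^{\beta_it}\prod_{k=1,k\neq i}^{j}(\beta_i-\beta_k)^{-1}\Big)dt\\
&= 1_{[0,\infty)}(t)\sum_{i=1}^{p-1}e^{\beta_it}\sum_{j=i}^{p-1}\alpha_j\prod_{k=1,k\neq i}^{j}(\beta_i-\beta_k)^{-1}\,dt.
\end{aligned}
\]
Finally, observe that the definition of $\alpha_0,\alpha_1,\dots,\alpha_{p-1}$ implies that
\[
\gamma_i = \frac{\sum_{j=i}^{p-1}\alpha_j\prod_{k=j+1}^{p-1}(\beta_i-\beta_k)}{\prod_{k=1,k\neq i}^{p-1}(\beta_i-\beta_k)} = \sum_{j=i}^{p-1}\alpha_j\prod_{k=1,k\neq i}^{j}(\beta_i-\beta_k)^{-1},\quad i=1,\dots,p-1,
\]
which concludes the proof. □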

Remark 2.2.

In Brockwell et al. [7] they assume that the roots of

are distinct.

This makes it possible to write $(Y_t)_{t\in\mathbb{R}}$ as a sum of dependent Ornstein–Uhlenbeck processes, which they in turn use to recover the driving Lévy process. In Theorem 2.1 above we invert the CARMA process by using that it is a solution to an SDDE and thereby circumvent the assumption of distinct roots. On the other hand, when $q\ge 2$ the roots of $Q$ may be complex-valued and this would make an estimation procedure that is parametrized by these roots (such as the one given in Section 3) more complicated in practice.

Theorem 2.1 gives an insightful intuition about inverting CARMA processes as well. Let $\mathcal{F}$ be the Fourier transform where $\mathcal{F}[f](y)=\int_{\mathbb{R}}e^{-ixy}f(x)\,dx$ for $f\in L^1$. If we then heuristically take this Fourier transform on both sides of (2.4) we get
\[
P(iy)\mathcal{F}[Y](y) = Q(iy)\mathcal{F}[DL](y),\quad y\in\mathbb{R}.
\]
For $\gamma_0\in\mathbb{R}$, this can be rewritten as
\[
\mathcal{F}[DL](y) = \Big(\frac{P(iy)-(iy-\gamma_0)Q(iy)}{Q(iy)}-\gamma_0\Big)\mathcal{F}[Y](y) + \mathcal{F}[DY](y),\quad y\in\mathbb{R}.
\]
If we let $\gamma_0=-\big(a_1+\sum_{j=1}^{p-1}\beta_j\big)$ then
\[
y\longmapsto\frac{P(iy)-(iy-\gamma_0)Q(iy)}{Q(iy)}\in L^2
\]
and consequently, there exists $f\in L^2$ such that
\[
\Big(\frac{P(iy)-(iy-\gamma_0)Q(iy)}{Q(iy)}-\gamma_0\Big)\mathcal{F}[Y](y) = \mathcal{F}[-f*Y-\gamma_0Y](y),\quad y\in\mathbb{R}.
\]
We conclude that $(Y_t)_{t\in\mathbb{R}}$ satisfies the formal equation
\[
DY_t = f*Y_t + \gamma_0Y_t + DL_t,\quad t\in\mathbb{R}.
\]
By integrating this equation we get an equation of the form (2.5), and in the case where $Q$ has distinct roots, contour integration and Cauchy's residue theorem imply that
\[
f(t) = -1_{[0,\infty)}(t)\sum_{i=1}^{p-1}\frac{P(\beta_i)}{Q'(\beta_i)}e^{\beta_it}
\]
in line with Theorem 2.1.
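The heuristic can be checked numerically. With distinct roots, (2.8) gives the Laplace transform $\int e^{-zt}\,\eta(dt)=\gamma_0+\sum_i\gamma_i/(z-\beta_i)$ for $\operatorname{Re}(z)\ge 0$, and the computation above amounts to the partial fraction identity $z-\gamma_0-\sum_i\gamma_i/(z-\beta_i)=P(z)/Q(z)$. The Python sketch below verifies this at one test point; the function name and all numerical values are illustrative assumptions.

```python
import numpy as np

def sdde_weights(a, beta):
    """gamma_0 and gamma_1..gamma_{p-1} from (2.8).

    a = [a_1..a_p] are the coefficients of P; beta holds the (distinct) roots
    of the monic polynomial Q of degree p - 1.
    """
    a = np.asarray(a, dtype=float)
    beta = np.asarray(beta, dtype=complex)
    P = np.poly1d(np.concatenate(([1.0], a)))
    Q = np.poly1d(np.poly(beta))          # monic polynomial with roots beta
    gamma0 = -(a[0] + beta.sum())
    gamma = -P(beta) / Q.deriv()(beta)
    return gamma0, gamma, P, Q

# Illustrative CARMA(3,2): P(z) = (z + 1)^3 and Q with roots -0.5 and -2.
a, beta = [3.0, 3.0, 1.0], np.array([-0.5, -2.0])
gamma0, gamma, P, Q = sdde_weights(a, beta)

z = 0.7 + 0.3j                            # arbitrary test point with Re(z) > 0
lhs = z - (gamma0 + np.sum(gamma / (z - beta)))
rhs = P(z) / Q(z)
```

The identity is purely algebraic, so any test point away from the roots of $Q$ would do.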

The simplest example beyond the Ornstein–Uhlenbeck process is the invertible CARMA(2,1) process:

Example 2.3. Suppose that $a_1,a_2\in\mathbb{R}$ are chosen such that the zeroes of $P(z)=z^2+a_1z+a_2$ have negative real parts and let $b_0>0$ so that the same holds for $Q(z)=b_0+z$. Then there exists an associated invertible CARMA(2,1) process $(Y_t)_{t\in\mathbb{R}}$, and Theorem 2.1 implies that
\[
dY_t = \alpha_0Y_t\,dt + \alpha_1\int_0^{\infty}e^{\beta_1u}Y_{t-u}\,du\,dt + dL_t,\quad t\in\mathbb{R},
\]
where $\beta_1=-b_0$, $\alpha_0=b_0-a_1$, and $\alpha_1=(a_1-b_0)b_0-a_2$. Note that, in this particular case, we have $\gamma_0=\alpha_0$ and $\gamma_1=\alpha_1$.
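To illustrate how the SDDE form of the CARMA(2,1) process can be used to back out the noise, the following Python sketch discretizes the equation with a plain Euler scheme and then inverts the same scheme to recover the increments of the driving process. This is a minimal sketch under stated assumptions, not the estimation procedure of the paper: the discretization, the function names, the zero pre-history of the path and the parameter values are all illustrative.

```python
import numpy as np

def simulate_sdde(dL, dt, alpha0, alpha1, beta1):
    """Euler scheme for dY = (alpha0*Y + alpha1*I) dt + dL,
    where I_t = int_0^inf exp(beta1*u) Y_{t-u} du (path assumed zero before time 0)."""
    y = np.zeros(len(dL) + 1)
    integral = 0.0
    for k, inc in enumerate(dL):
        y[k + 1] = y[k] + (alpha0 * y[k] + alpha1 * integral) * dt + inc
        # Recursive update of the exponentially weighted integral.
        integral = np.exp(beta1 * dt) * (integral + y[k] * dt)
    return y

def recover_increments(y, dt, alpha0, alpha1, beta1):
    """Invert the scheme above: dL_k = Y_{k+1} - Y_k - drift_k * dt."""
    dL = np.empty(len(y) - 1)
    integral = 0.0
    for k in range(len(y) - 1):
        dL[k] = y[k + 1] - y[k] - (alpha0 * y[k] + alpha1 * integral) * dt
        integral = np.exp(beta1 * dt) * (integral + y[k] * dt)
    return dL

# Illustrative parameters: a1 = 1.5, a2 = 0.5, b0 = 0.3 in the notation of Example 2.3.
a1, a2, b0 = 1.5, 0.5, 0.3
beta1, alpha0, alpha1 = -b0, b0 - a1, (a1 - b0) * b0 - a2

rng = np.random.default_rng(0)
dt = 0.01
true_dL = rng.normal(0.0, np.sqrt(dt), size=200)   # Brownian increments
path = simulate_sdde(true_dL, dt, alpha0, alpha1, beta1)
recovered_dL = recover_increments(path, dt, alpha0, alpha1, beta1)
```

Because simulation and recovery use the same discretization, the increments are recovered up to floating-point error; with real data observed on a grid, the discretization error of the drift term would enter as well.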

We end this section by giving the mean and the autocovariance function of the invertible CARMA($p,p-1$) process. To formulate the result we introduce the $p\times p$ matrix







0 0 ··· 0 1

1 β

0 ··· 0 0

0 1 β

··· 0 0

0 0 ··· 1 β

p−1

··· α

p−2

p−1







, (2.9)

where

,α

,β

,... , α

p−1

,β

p−1

∈ C

are given as in Theorem 2.1. In case

= 1, respec-

tively p = 2, the matrix in (2.9) reduces to A

= α

, respectively

Proposition 2.4.

Let (

)

t∈R

be an invertible CARMA(

p, p −

1) process and let

be the

associated measure introduced in Theorem 2.1. Then

E[Y

] = −

η([0,∞))

and γ(h) B Cov(Y

) = σ

|h|

Σe

for h ∈ R,

where

µ = E[L

], σ

= Var(L

), and Σ =

∞

dy.

In particular, (Y

)

t∈R

is centered if and only if (L

)

t∈R

is centered.

Proof. The mean of $Y_0$ is obtained from (2.5) using the stationarity of $(Y_t)_{t\in\mathbb{R}}$. The autocovariance function of $(Y_t)_{t\in\mathbb{R}}$ is given in [2, p. 14]. □

3 Estimation of the SDDE parameters

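Fix $\Delta>0$ and $n\in\mathbb{N}$, and suppose that we have $n+1$ equidistant observations $Y_0,Y_\Delta,Y_{2\Delta},\dots,Y_{n\Delta}$ of an invertible CARMA($p,p-1$) process $(Y_t)_{t\in\mathbb{R}}$. Our interest will be on estimating the vector of parameters
\[
\theta_0 = [\alpha_0,\alpha_1,\beta_1,\alpha_2,\beta_2,\dots,\alpha_{p-1},\beta_{p-1}]^\top
\]
of $\eta$ in (2.6). We will restrict our attention to the case where $\theta_0\in\mathbb{R}^{2p-1}$. For simplicity we will also assume that $(L_t)_{t\in\mathbb{R}}$ or, equivalently, $(Y_t)_{t\in\mathbb{R}}$ is centered. For any given $\theta$ let $\pi_{k-1}(Y_{k\Delta};\theta)$ be the $L^2(\mathbb{P}_\theta)$ projection of $Y_{k\Delta}$ onto the linear span of $Y_0,Y_\Delta,Y_{2\Delta},\dots,Y_{(k-1)\Delta}$ and set $\varepsilon_k(\theta)=Y_{k\Delta}-\pi_{k-1}(Y_{k\Delta};\theta)$. Then the least squares estimator $\hat\theta_n$ of $\theta_0$ is the point that minimizes
\[
\theta\longmapsto\sum_{k=1}^{n}\varepsilon_k(\theta)^2.
\]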

In practice, the projections $\pi_{k-1}(Y_{k\Delta}; \theta)$, $k = 1, \dots, n$, can be computed using the Kalman recursions (see, e.g., [6, Proposition 12.2.2]) together with the state-space representation given in Proposition 3.1 below. We stress that one can compute the projections without a state-space representation, e.g., using the Durbin–Levinson algorithm (see [6, p. 169]), but this approach will be very time-consuming for large $n$ and a cut-off is necessary in practice. (This technique is used by [12] in the SDDE framework (2.5) when $\eta$ is compactly supported and $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion.) Under weak regularity assumptions, following the arguments in [7, Propositions 4–5] that rely on [10], one can show that the estimator $\hat{\theta}_n$ is (strongly) consistent and asymptotically normal.

Proposition 3.1 provides a convenient state-space representation of $(Y_{k\Delta})_{k\in\mathbb{N}}$ in terms of $\alpha_0, \alpha_1, \beta_1, \dots, \alpha_{p-1}, \beta_{p-1}$ (rather than the one from the definition of $(Y_t)_{t\in\mathbb{R}}$ in terms of the coefficients of $P$ and $Q$).

Proposition 3.1.

Let the setup be as above and let

be the matrix given in

(2.9)

. Then

(

k∆

)

k∈N

has the state-space representation

k∆

k ∈ N

, with (

)

k∈N

satisfying

the state-equation

= e

∆

k−1

+ U

, k ∈ N,

where (

)

k∈N

is a sequence of i.i.d. random vectors with mean 0 and covariance matrix

∆

du.

Proof.

From [2, Proposition 3.13] we have that

t ∈ R

, where (

)

t∈R

is the

-valued Ornstein–Uhlenbeck process given by

−∞

(t−u)

, t ∈ R.

Thus, by deﬁning Z

k∆

so that Y

k∆

= e

for k ∈ N

, and observing that

(k−1)∆

−∞

(k∆−u)

k∆

(k−1)∆

(k∆−u)

= e

∆

k−1

+ U

with U

k∆

(k−1)∆

(k∆−u)

for k ∈ N, the result follows immediately. 
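In the simplest case $p = 1$ the matrix $A$ reduces to the scalar $\alpha_0$, and the state equation becomes an AR(1) recursion. The sketch below illustrates this special case only and is not the general Kalman-based procedure: it assumes Gaussian noise with variance `sigma2` per unit time, and all function names are hypothetical. For $p = 1$ the projection onto the past is $e^{\alpha_0\Delta} Y_{(k-1)\Delta}$, so the least squares criterion is minimized by an ordinary regression of $Y_{k\Delta}$ on $Y_{(k-1)\Delta}$.

```python
import math
import random


def ou_discretization(alpha0, sigma2, delta):
    """AR(1) coefficient e^{alpha0*delta} and innovation variance
    sigma2 * int_0^delta e^{2*alpha0*u} du of the sampled process (alpha0 < 0)."""
    phi = math.exp(alpha0 * delta)
    var_u = sigma2 * (math.exp(2 * alpha0 * delta) - 1) / (2 * alpha0)
    return phi, var_u


def simulate_sampled_ou(alpha0, sigma2, delta, n, rng):
    """Exact simulation of Y_0, Y_delta, ..., Y_{n*delta} under Gaussian noise."""
    phi, var_u = ou_discretization(alpha0, sigma2, delta)
    sd_u = math.sqrt(var_u)
    # Start in the stationary distribution N(0, sigma2 / (-2*alpha0)).
    y = [rng.gauss(0.0, math.sqrt(sigma2 / (-2 * alpha0)))]
    for _ in range(n):
        y.append(phi * y[-1] + rng.gauss(0.0, sd_u))
    return y


def least_squares_alpha0(y, delta):
    """Minimize sum_k (Y_k - phi*Y_{k-1})^2 over phi, then map back to alpha0."""
    num = sum(y[k] * y[k - 1] for k in range(1, len(y)))
    den = sum(v * v for v in y[:-1])
    return math.log(num / den) / delta


if __name__ == "__main__":
    path = simulate_sampled_ou(-1.0, 2.0, 0.1, 20000, random.Random(1))
    print(least_squares_alpha0(path, 0.1))
```

For $p \geq 2$ the one-step projections depend on the whole past, which is where the Kalman recursions mentioned above come in.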

4 A simulation study, p = 2

The simulation of the invertible CARMA(2,1) is done in a straightforward manner by the (defining) state-space representation of $(Y_t)_{t\in\mathbb{R}}$ and an Euler discretization of (2.2). In order to ensure that $Y_0$ is a realization of the stationary distribution we take 20000 steps of size 0.01 before time 0. Given $Y_0$ the simulation is based on 200000 steps each of size 0.01, and then it is assumed that we have $n + 1 = 2000$, respectively $n + 1 = 20000$, observations of the process $Y_0, Y_\Delta, Y_{2\Delta}, \dots, Y_{(n-1)\Delta}$ on a grid with distance $\Delta = 1$, respectively $\Delta = 0.1$, between adjacent points. We will be considering the case where the background noise $(L_t)_{t\in\mathbb{R}}$ is a gamma (Lévy) process with shape parameter $\lambda > 0$ and scale parameter $\xi > 0$. Recall that the gamma process with parameters $\lambda$ and $\xi$ is a pure jump process with infinite activity, and the density $f$ (at time 1) is given by

$$f(x) = \frac{1}{\Gamma(\lambda)\xi^{\lambda}}\, x^{\lambda-1} e^{-x/\xi}, \quad x > 0,$$


where $\Gamma$ is the gamma function. In line with [7] we will choose the parameters to be $\lambda = 0.2488$ and $\xi = 0.5792$. For comparison we will also study the situation where $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion with mean $\lambda\xi = 0.1441$ and standard deviation $\sqrt{\lambda}\,\xi = 0.2889$ (these parameters are chosen so that the Brownian motion matches the mean and standard deviation of the gamma process). After subtracting the sample mean $n^{-1}\sum_{k=0}^{n-1} Y_{k\Delta}$ from the observations, the vector of true parameters $\theta_0 = [\alpha_0, \alpha_1, \beta_1]$ is estimated as outlined in Section 3. We will choose $\theta_0 = [-1.1558, 0.1939, -0.2061]$ as in [7] (this choice corresponds to $a_1 = 1.3619$, $a_2 = 0.0443$, and $b_0 = 0.2061$, which are certain estimated values of a stochastic volatility model by [15]). We repeat the experiment 100 times and the estimated parameters are given in Table 1.
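The simulation setup can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the standard companion-form state-space representation of a CARMA(2,1) process (equation (2.2) is not reproduced in this section), and the names `gamma_moments`, `simulate_carma21` and `increment` are hypothetical.

```python
import math
import random


def gamma_moments(lam, xi):
    """Mean and standard deviation at time 1 of a gamma process
    with shape lam and scale xi."""
    return lam * xi, math.sqrt(lam) * xi


def simulate_carma21(a1, a2, b0, increment, h=0.01, burn_in=20000, steps=200000):
    """Euler scheme for the assumed state-space form
    dX1 = X2 dt, dX2 = (-a2*X1 - a1*X2) dt + dL, Y = b0*X1 + X2.
    `increment(h)` returns one noise increment over a step of length h.
    Returns Y on the fine grid 0, h, ..., steps*h (after the burn-in)."""
    x1 = x2 = 0.0
    path = []
    for i in range(burn_in + steps + 1):
        if i >= burn_in:
            path.append(b0 * x1 + x2)
        dl = increment(h)
        x1, x2 = x1 + x2 * h, x2 + (-a2 * x1 - a1 * x2) * h + dl
    return path


if __name__ == "__main__":
    lam, xi = 0.2488, 0.5792
    mean, sd = gamma_moments(lam, xi)
    rng = random.Random(0)
    gamma_inc = lambda h: rng.gammavariate(lam * h, xi)            # gamma noise
    gauss_inc = lambda h: rng.gauss(mean * h, sd * math.sqrt(h))   # matched Brownian motion
    fine = simulate_carma21(1.3619, 0.0443, 0.2061, gamma_inc)
    coarse = fine[::100]  # keep every 100th point, i.e. grid spacing 1
    print(round(mean, 4), round(sd, 4), len(coarse))
```

`random.gammavariate(alpha, beta)` takes a shape and a scale parameter, so an increment over a step of length `h` has shape `lam * h` and scale `xi`.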

Table 1: Estimated SDDE parameters based on 100 simulations of the CARMA(2,1) process on [0, 2000] with true parameters $\alpha_0 = -1.1558$, $\alpha_1 = 0.1939$, and $\beta_1 = -0.2061$.

Noise      Spacing    Parameter     Mean      Bias      Std
Gamma      ∆ = 1      $\alpha_0$    −1.2075   −0.0517   0.1155
                      $\alpha_1$     0.2157    0.0218   0.0501
                      $\beta_1$     −0.2190   −0.0129   0.0366
           ∆ = 0.1    $\alpha_0$    −1.1688   −0.0130   0.0466
                      $\alpha_1$     0.1934   −0.0005   0.0315
                      $\beta_1$     −0.2053    0.0008   0.0296
Gaussian   ∆ = 1      $\alpha_0$    −1.1967   −0.0409   0.1147
                      $\alpha_1$     0.2117    0.0178   0.0524
                      $\beta_1$     −0.2201   −0.0140   0.0358
           ∆ = 0.1    $\alpha_0$    −1.1653   −0.0095   0.0469
                      $\alpha_1$     0.2002    0.0062   0.0353
                      $\beta_1$     −0.2121   −0.0060   0.0324

It appears that the (absolute value of the) bias of $[\alpha_0, \alpha_1, \beta_1]$ is very small when $\Delta = 0.1$. The general picture is that the bias is largest for $\alpha_0$, and it is also consistently negative. This observation should, however, be seen in light of the relative size of $\alpha_0$ compared to $\alpha_1$ and $\beta_1$.

Once we have estimated $\theta_0$ we can estimate the driving Lévy process by exploiting the relation presented in Theorem 2.1 and using the trapezoidal rule. Note that, as in the estimation, we use the relation in Theorem 2.1 on the demeaned data so that we in turn recover the centered version of the Lévy process. Finally, to obtain an estimate of the true Lévy process we estimate $E[L_1]$ using Proposition 2.4. In order to

get a proper approximation of the integral

∞

(

t−s

−E

[

])

we will only be

estimating

k∆

−L

(k−1)∆

for

m B

∆

−1

≤ k ≤ n. If one is interested in estimating the entire path $L_{(m+1)\Delta} - L_{m\Delta}, L_{(m+2)\Delta} - L_{m\Delta}, \dots, L_{n\Delta} - L_{m\Delta}$, one will need data observed at a high frequency, that is, small $\Delta$, since the approximation errors accumulate over time. Typically, one is more interested in estimating the distribution of $L_1$, which is less sensitive to these approximation errors, and this is our focus in the following. For this reason, we have in Figure 2 plotted five estimations of the distribution function of $L_1$ in dashed lines against the true distribution function (represented by a solid line) in the low frequency case ($\Delta = 1$). The left, respectively right, figure corresponds to the gamma, respectively Gaussian, case. Due to the above conventions, each estimated distribution function is based on 1950 estimated realizations of $L_1$. Generally, the estimated distribution functions in the figures seem to capture the true structure and give a fairly precise estimate; however, there is a slight tendency to over-estimate small values and under-estimate large values.
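The recovery step can be sketched for $p = 2$ as follows. This is one possible discretization under stated assumptions, not necessarily the one used for the figures: it uses the exponential delay kernel of Example 2.3, truncates the unobserved history before time 0, and applies the trapezoidal rule to both the inner delay integral and the outer time integral. The function name `recover_increments` and the burn-in index `m` are hypothetical.

```python
import math


def recover_increments(y, delta, alpha0, alpha1, beta1, m=0):
    """Estimate L_{k*delta} - L_{(k-1)*delta} for k >= max(m, 1) from
    (demeaned) observations y[0], y[1], ... on a grid with spacing delta, using
    dL_t = dY_t - alpha0*Y_t dt - alpha1 * int_0^inf e^{beta1*u} Y_{t-u} du dt."""
    decay = math.exp(beta1 * delta)
    inner = 0.0  # trapezoidal approximation of int_0^{k*delta} e^{beta1*u} Y_{k*delta-u} du
    drift_prev = alpha0 * y[0] + alpha1 * inner
    increments = []
    for k in range(1, len(y)):
        # One step of the composite trapezoidal rule, written recursively.
        inner = decay * inner + 0.5 * delta * (y[k] + decay * y[k - 1])
        drift = alpha0 * y[k] + alpha1 * inner
        if k >= max(m, 1):
            increments.append(y[k] - y[k - 1] - 0.5 * delta * (drift + drift_prev))
        drift_prev = drift
    return increments


if __name__ == "__main__":
    obs = [0.3, 0.1, -0.2, 0.4, 0.0]
    print(recover_increments(obs, 1.0, -1.1558, 0.1939, -0.2061, m=2))
```

Discarding the first `m` increments limits the effect of the truncated history, since the weight $e^{\beta_1 u}$ on observations before time 0 decays exponentially.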

Due to the high degree of precision of the estimated distribution functions, we plot an associated histogram, based on 1950 realizations of $L_1$ and a sampling frequency of $\Delta = 1$, against the theoretical probability density function in order to detect potential (smaller) biases. We compare this to a histogram of the actual increments. For simplicity, we have restricted ourselves to the Gaussian case as the gamma case is difficult to analyze close to zero (specifically, this will require more observations). The plots are found in Figure 3. We see that the two histograms have very similar appearances, but the histogram based on estimated parameters has a slightly smaller mean.

5 Conclusion and further research

In this paper we have studied the ability to recover the underlying Lévy process

from an observed invertible CARMA process using the SDDE relation presented in

Theorem 2.1. In particular, after discussing the theoretical foundations, we did a

simulation study similar to the one in the classical approach presented in [7] and

estimated the underlying Lévy noise. Our ﬁndings supported the theory and it seemed

possible to (visually) detect the distribution of the underlying Lévy process.

Future research could include a further study of the performance of the presented

SDDE inversion technique compared to the classical approach in [7]. Speciﬁcally, in

light of Remark 2.2, a suggestion could be to consider a situation where $Q$ has a root of multiplicity strictly greater than one or where $q \geq 2$ and some of the roots of $Q$ are not real numbers. Such situations may complicate the analysis in one approach relative to the other. Furthermore, it may be interesting to study inversion formulas for invertible CARMA($p, q$) processes when $p > q + 1$. In particular, a manipulation of the equation (2.4) yields

$$dL_t = \frac{P(D)}{Q(D)}\, Y_t\, dt, \quad t \in \mathbb{R}. \tag{5.1}$$

The content of Theorem 2.1 is that the right-hand side of (5.1) is meaningful when $p = q + 1$ and it should be interpreted as $dY_t - \int_{[0,\infty)} Y_{t-s}\, \eta(ds)\, dt$. It seems that this statement continues to hold when $p > q + 1$ as well when $dY_t$ is replaced by a suitable linear combination of $dY_t, d(DY)_t, \dots, d(D^{p-q-1}Y)_t$.


Figure 2: Five estimations of the distribution function of $L_1$, based on estimates of $\alpha_0$, $\alpha_1$, and $\beta_1$, plotted against the true distribution function for a sampling frequency of $\Delta = 1$. The left corresponds to gamma noise and the right to Gaussian noise.

Figure 3: Histograms of the true increments on the left and estimated increments, based on estimates of $\alpha_0$, $\alpha_1$, and $\beta_1$ for a sampling frequency of $\Delta = 1$, on the right plotted against the theoretical (Gaussian) probability density function.

Acknowledgments

The research was supported by the Danish Council for Independent Research (grant

DFF–4002–00003).

References

[1] Barndorff-Nielsen, O.E. and N. Shephard (2001). Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics. J. R. Stat. Soc. Ser. B Stat. Methodol. 63(2), 167–241.

[2] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2017). A continuous-time framework for ARMA processes. arXiv: 1704.08574v1.

[3] Benth, F.E., C. Klüppelberg, G. Müller and L. Vos (2014). Futures pricing in electricity markets based on stable CARMA spot models. Energy Econ. 44, 392–406.

[4] Benth, F.E., J. Šaltytė-Benth and S. Koekebakker (2007). Putting a price on temperature. Scand. J. Statist. 34(4), 746–767. doi: 10.1111/j.1467-9469.2007.00564.x.

[5] Brockwell, P.J. (2001). Lévy-driven CARMA processes. Ann. Inst. Statist. Math. 53(1). Nonlinear non-Gaussian models and related filtering methods (Tokyo, 2000), 113–124. doi: 10.1023/A:1017972605872.

[6] Brockwell, P.J. and R.A. Davis (2006). Time series: theory and methods. Springer Series in Statistics. Reprint of the second (1991) edition. Springer, New York.

[7] Brockwell, P.J., R.A. Davis and Y. Yang (2011). Estimation for non-negative Lévy-driven CARMA processes. J. Bus. Econom. Statist. 29(2), 250–259. doi: 10.1198/jbes.2010.08165.

[8] Brockwell, P.J., V. Ferrazzano and C. Klüppelberg (2013). High-frequency sampling and kernel estimation for continuous-time moving average processes. J. Time Series Anal. 34(3), 385–404. doi: 10.1111/jtsa.12022.

[9] Brockwell, P.J. and A. Lindner (2009). Existence and uniqueness of stationary Lévy-driven CARMA processes. Stochastic Process. Appl. 119(8), 2660–2681. doi: 10.1016/j.spa.2009.01.006.

[10] Francq, C. and J.-M. Zakoïan (1998). Estimating linear representations of nonlinear processes. J. Statist. Plann. Inference 68(1), 145–165.

[11] García, I., C. Klüppelberg and G. Müller (2011). Estimation of stable CARMA models with an application to electricity spot prices. Stat. Model. 11(5), 447–470.

[12] Küchler, U. and M. Sørensen (2013). Statistical inference for discrete-time samples from affine stochastic delay differential equations. Bernoulli 19(2), 409–425. doi: 10.3150/11-BEJ411.

[13] Schwartz, E.S. (1997). The stochastic behavior of commodity prices: Implications for valuation and hedging. J. Finance 52(3), 923–973.

[14] Todorov, V. (2009). Estimation of continuous-time stochastic volatility models with jumps using high-frequency data. J. Econometrics 148(2), 131–148.

[15] Todorov, V. (2011). Econometric analysis of jump-driven stochastic volatility models. J. Econometrics 160(1), 12–21.

[16] Todorov, V. and G. Tauchen (2006). Simulation methods for Lévy-driven continuous-time autoregressive moving average (CARMA) stochastic volatility models. J. Bus. Econom. Statist. 24(4), 455–469. doi: 10.1198/073500106000000260.

Paper D

Multivariate Stochastic Delay Diﬀerential

Equations and CAR Representations of

CARMA Processes

Andreas Basse-O’Connor, Mikkel Slot Nielsen, Jan Pedersen and Victor Rohde

Abstract

In this study we show how to represent a continuous time autoregressive moving average (CARMA) as a higher order stochastic delay differential equation, which may be thought of as a CAR(∞) representation. Furthermore, we show how the CAR(∞) representation gives rise to a prediction formula for CARMA processes. To be used in the above mentioned results we develop a general theory for multivariate stochastic delay differential equations, which will be of independent interest, and which will have particular focus on existence, uniqueness and representations.

MSC: 60G05; 60G22; 60G51; 60H05; 60H10

Keywords: CARMA processes; FICARMA processes; Long memory; MCARMA processes; Multivariate Ornstein–Uhlenbeck processes; Multivariate stochastic delay differential equations; Noise recovery; Prediction

1 Introduction and main ideas

The class of autoregressive moving average (ARMA) processes is one of the most

popular classes of stochastic processes for modeling time series in discrete time. This

class goes back to the thesis of Whittle in 1951 and was popularized in Box and

Jenkins [5]. The continuous time analogue of an ARMA process is called a CARMA

process, and it is the formal solution $(X_t)_{t\in\mathbb{R}}$ to the equation

$$P(D)X_t = Q(D)DZ_t, \quad t \in \mathbb{R}, \tag{1.1}$$


where $P$ and $Q$ are polynomials of degree $p$ and $q$, respectively. Furthermore, $D$ denotes differentiation with respect to $t$, and $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process, the continuous time analogue of a random walk. In the following we will assume that $p > q$ and $P(z)Q(z) \neq 0$ whenever $\mathrm{Re}(z) \geq 0$. Under this assumption, $(X_t)_{t\in\mathbb{R}}$ can be rigorously defined through a state-space representation as long as $(Z_t)_{t\in\mathbb{R}}$ has log moments. Lévy-driven CARMA processes have found many applications, for example, in modeling temperature, electricity and stochastic volatility, cf. [4, 14, 27]. Moreover, there exists a vast amount of literature on theoretical results for CARMA processes (and variations of these), and a few references are [6, 7, 8, 10, 18, 19, 26].

It is well-known that any causal CARMA process has a continuous time moving average representation of CMA(∞) type

$$X_t = \int_{-\infty}^{t} g(t-u)\, dZ_u, \quad t \in \mathbb{R},$$

see the references above or Section 4.3. This representation may be very convenient for studying many of their properties. A main contribution of our work is that we obtain a CAR(∞) representation of CARMA processes and, by the arguments below, it will take the form

$$R(D)X_t = \int_0^\infty X_{t-u} f(u)\, du + DZ_t, \quad t \in \mathbb{R}, \tag{1.2}$$

where $R$ is a polynomial of order $p - q$ and $f : \mathbb{R} \to \mathbb{R}$ is a deterministic function, both defined through $P$ and $Q$. Since $(X_t)_{t\in\mathbb{R}}$ is $p - q - 1$ times differentiable, see [19, Proposition 3.32], the relation (1.2) is well-defined if we integrate both sides. A heuristic argument for why (1.2) is a reasonable continuous time equivalent of the discrete time AR(∞) representation is as follows. If $q = 0$, $Q$ is constant and (1.2) holds with $R = P$ and $f = 0$. If $q \geq 1$, it is convenient to rephrase (1.1) in the frequency domain (that is, apply the Fourier transform $\mathcal{F}$ on both sides of the equation and rearrange terms):

$$\frac{P(iy)}{Q(iy)}\, \mathcal{F}[X](y) = \mathcal{F}[DZ](y), \quad y \in \mathbb{R}. \tag{1.3}$$

Using polynomial long division we may choose a polynomial $R$ of order $p - q$ such that

$$S(z) := Q(z)R(z) - P(z), \quad z \in \mathbb{C},$$

is a polynomial of at most order $q - 1$. Now observe that

$$\frac{P(iy)}{Q(iy)}\, \mathcal{F}[X](y) = \Big( R(iy) - \frac{S(iy)}{Q(iy)} \Big) \mathcal{F}[X](y) = \mathcal{F}[R(D)X](y) - \mathcal{F}[f](y)\, \mathcal{F}[X](y),$$

where $f : \mathbb{R} \to \mathbb{R}$ is the $L^2$ function characterized by the relation $\mathcal{F}[f](y) = S(iy)/Q(iy)$ for $y \in \mathbb{R}$. (In fact, we even know that $f$ is vanishing on $(-\infty, 0)$ and decays exponentially fast at $\infty$, cf. Remark 4.10.) Combining this identity with (1.3) results in the representation (1.2).
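The long-division step is easy to carry out numerically. The sketch below is an illustration with hypothetical function names; polynomials are stored as coefficient lists with the highest degree first, and the example uses CARMA(2,1) coefficients $P(z) = z^2 + 1.3619z + 0.0443$ and $Q(z) = z + 0.2061$.

```python
def poly_divmod(num, den):
    """Polynomial long division (coefficients highest degree first).
    Assumes deg(num) >= deg(den) and den[0] != 0.
    Returns (quotient, remainder) with num = den * quotient + remainder."""
    num = list(num)
    quot = []
    for i in range(len(num) - len(den) + 1):
        c = num[i] / den[0]
        quot.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    rem = num[len(num) - len(den) + 1:]
    return quot, rem


def car_representation(P, Q):
    """Return (R, S) with S = Q*R - P and deg(S) <= deg(Q) - 1."""
    quot, rem = poly_divmod(P, Q)
    return quot, [-r for r in rem]


if __name__ == "__main__":
    R, S = car_representation([1.0, 1.3619, 0.0443], [1.0, 0.2061])
    print(R)  # coefficients of R(z) = z + (a1 - b0)
    print(S)  # the constant S = (a1 - b0)*b0 - a2
```

In this example $S/Q$ is a constant divided by $iy + b_0$, so $f$ is a multiple of $e^{-b_0 t}$ on $[0, \infty)$, in line with the remark that $f$ vanishes on $(-\infty, 0)$ and decays exponentially.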

We show in Theorem 4.8 that (1.2) does indeed hold true for any invertible (Lévy-driven) CARMA process. Similar relations are shown to hold for invertible fractionally integrated CARMA (FICARMA) processes, where $(Z_t)_{t\in\mathbb{R}}$ is a fractional Lévy process, and also for their multi-dimensional counterparts, which we will refer to as MCARMA and MFICARMA processes, respectively. We use these representations to obtain a prediction formula for general CARMA type processes (see Corollary 4.11). A prediction formula for invertible one-dimensional Lévy-driven CARMA processes is given in [9, Theorem 2.7], but prediction formulas for MCARMA processes have, to the best of our knowledge, not been studied in the literature.

Autoregressive representations such as (1.2) are useful for several reasons. To give a few examples, they separate the noise $(Z_t)_{t\in\mathbb{R}}$ from $(X_t)_{t\in\mathbb{R}}$ and hence provide a recipe for recovering increments of the noise from the observed process, they ease the task of prediction (and thus estimation), and they clarify the dynamic behavior of the process. These facts motivate the idea of defining a broad class of processes, including the CARMA type processes above, which all admit an autoregressive representation, and it turns out that a well-suited class to study is the one formed by solutions to multi-dimensional stochastic delay differential equations (MSDDEs). To be precise, for an integrable $n$-dimensional (measurable) process $Z_t = [Z_t^1, \dots, Z_t^n]^\top$, $t \in \mathbb{R}$, with stationary increments and a finite signed $n \times n$ matrix-valued measure $\eta$, concentrated on $[0, \infty)$, a stationary process $X_t = [X_t^1, \dots, X_t^n]^\top$, $t \in \mathbb{R}$, is a solution to the associated MSDDE if it satisfies

$$dX_t = \eta * X(t)\, dt + dZ_t, \quad t \in \mathbb{R}. \tag{1.4}$$

By equation (1.4) we mean that

$$X_t^j - X_s^j = \sum_{k=1}^{n} \int_s^t \int_{[0,\infty)} X_{u-v}^k\, \eta_{jk}(dv)\, du + Z_t^j - Z_s^j, \quad j = 1, \dots, n, \tag{1.5}$$

almost surely for each $s < t$. This system of equations is an extension of the stochastic delay differential equation (SDDE) in [3, Section 2] to the multivariate case. The overall structure of (1.4) is also in line with earlier literature such as [16, 20] on univariate SDDEs, but here we allow for infinite delay ($\eta$ is allowed to have unbounded support) which is a key property in order to include the CARMA type processes in the framework.

The structure of the paper is as follows: In Section 2 we introduce the notation

used throughout this paper. Next, in Section 3, we develop the general theory for

MSDDEs with particular focus on existence, uniqueness and prediction. The general

results of Section 3 are then specialized in Section 4 to various settings. Speciﬁcally,

in Section 4.1 we consider the case where the noise process gives rise to a reasonable

integral, and in Section 4.2 we demonstrate how to derive results for higher order

SDDEs by nesting them into MSDDEs. Finally, in Section 4.3 we use the above

mentioned ﬁndings to represent CARMA processes and generalizations thereof as

solutions to higher order SDDEs and to obtain the corresponding prediction formulas.

2 Notation

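Let $f : \mathbb{R} \to \mathbb{C}^{m \times k}$ be a measurable function and $\mu$ a $k \times n$ (non-negative) matrix measure, that is,

$$\mu = \begin{bmatrix} \mu_{11} & \cdots & \mu_{1n} \\ \vdots & \ddots & \vdots \\ \mu_{k1} & \cdots & \mu_{kn} \end{bmatrix}$$

where each $\mu_{lj}$ is a measure on $\mathbb{R}$. Then, we will write $f \in L^p(\mu)$ if

$$\int_{\mathbb{R}} |f_{il}(t)|^p\, \mu_{lj}(dt) < \infty$$

for $l = 1, \dots, k$, $i = 1, \dots, m$ and $j = 1, \dots, n$. Provided that $f \in L^1(\mu)$, we set

$$\int_{\mathbb{R}} f(t)\, \mu(dt) = \sum_{l=1}^{k} \begin{bmatrix} \int_{\mathbb{R}} f_{1l}(t)\, \mu_{l1}(dt) & \cdots & \int_{\mathbb{R}} f_{1l}(t)\, \mu_{ln}(dt) \\ \vdots & \ddots & \vdots \\ \int_{\mathbb{R}} f_{ml}(t)\, \mu_{l1}(dt) & \cdots & \int_{\mathbb{R}} f_{ml}(t)\, \mu_{ln}(dt) \end{bmatrix}. \tag{2.1}$$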

If $\mu$ is the Lebesgue measure, we will suppress the dependence on the measure and write $f \in L^p$, and in case $f$ is measurable and bounded Lebesgue almost everywhere, $f \in L^\infty$. For two (matrix) measures $\mu^+$ and $\mu^-$, where at least one of them is finite, we call the set function $\mu(B) := \mu^+(B) - \mu^-(B)$, defined for any Borel set $B$, a signed measure (and, from this point, simply referred to as a measure). We may and do assume that the two measures $\mu^+$ and $\mu^-$ are singular. To the measure $\mu$ we will associate its variation measure $|\mu| := \mu^+ + \mu^-$, and when $|\mu|(\mathbb{R}) < \infty$, we will say that $\mu$ is finite. Integrals with respect to $\mu$ are defined in a natural way from (2.1) whenever $f \in L^1(\mu) := L^1(|\mu|)$. If $f$ is one-dimensional, respectively if $\mu$ is one-dimensional, we will write $f \in L^1(\mu)$ if $f \in L^1(|\mu_{ij}|)$ for all $i = 1, \dots, k$ and $j = 1, \dots, n$, respectively if $f_{ij} \in L^1(|\mu|)$ for all $i = 1, \dots, m$ and $j = 1, \dots, k$. The associated integral is defined in an obvious manner.

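We define the convolution at a given point $t \in \mathbb{R}$ by

$$f * \mu(t) = \int_{\mathbb{R}} f(t-u)\, \mu(du)$$

provided that $f(t - \cdot) \in L^1(\mu)$. In case that $\mu$ is the Lebesgue–Stieltjes measure of a function $g : \mathbb{R} \to \mathbb{R}^{k \times n}$ we will also write $f * g(t)$ instead of $f * \mu(t)$ (not to be confused with the standard convolution between functions). For a given measure $\mu$ we set

$$D(\mu) = \Big\{ z \in \mathbb{C} : \int_{\mathbb{R}} e^{-\mathrm{Re}(z)t}\, |\mu_{ij}|(dt) < \infty \text{ for } i = 1, \dots, k \text{ and } j = 1, \dots, n \Big\}$$

and define its Laplace transform $\mathcal{L}[\mu]$ as

$$\mathcal{L}[\mu]_{ij}(z) = \int_{\mathbb{R}} e^{-zt}\, \mu_{ij}(dt) \quad \text{for } z \in D(\mu),\ i = 1, \dots, k \text{ and } j = 1, \dots, n.$$

If $\mu$ is a finite measure, we will also refer to the Fourier transform $\mathcal{F}[\mu]$ of $\mu$, which is given as $\mathcal{F}[\mu](y) = \mathcal{L}[\mu](iy)$ for $y \in \mathbb{R}$. If $\mu(dt) = f(t)\, dt$ for some measurable function $f$, we write $\mathcal{L}[f]$ and $\mathcal{F}[f]$ instead. We will also use that the Fourier transform $\mathcal{F}$ extends from $L^1$ to $L^1 \cup L^2$, and it maps $L^2$ onto $L^2$. We will say that $\mu$ has a moment of order $p \in \mathbb{N}_0$ if

$$\int_{\mathbb{R}} |t|^p\, |\mu_{jk}|(dt) < \infty \quad \text{for all } j, k = 1, \dots, n.$$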

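Finally, for two functions $f, g : \mathbb{R} \to \mathbb{C}$ and $a \in [-\infty, \infty]$, we write $f(t) = o(g(t))$, $f(t) \sim g(t)$ and $f(t) = O(g(t))$ as $t \to a$ if

$$\lim_{t \to a} \frac{f(t)}{g(t)} = 0, \quad \lim_{t \to a} \frac{f(t)}{g(t)} = 1 \quad \text{and} \quad \limsup_{t \to a} \Big| \frac{f(t)}{g(t)} \Big| < \infty,$$

respectively.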


3 Stochastic delay diﬀerential equations

Consider the general MSDDE in (1.4), where the noise $(Z_t)_{t\in\mathbb{R}}$ is a measurable process, which is integrable and has stationary increments, and the delay measure $\eta$ is a finite (signed) $n \times n$ matrix-valued measure concentrated on $[0, \infty)$. The first main result provides sufficient conditions to ensure existence and uniqueness of a solution. To obtain such results we need to put assumptions on the delay measure $\eta$. In order to do so, we associate to $\eta$ the function $h : D(\eta) \to \mathbb{C}^{n \times n}$ given by

$$h(z) = I_n z - \mathcal{L}[\eta](z), \quad z \in D(\eta), \tag{3.1}$$

where $I_n$ is the $n \times n$ identity matrix.

Theorem 3.1. Let $h$ be given as in (3.1) and suppose that $\det(h(iy)) \neq 0$ for all $y \in \mathbb{R}$. Suppose further that $\eta$ has second moment. Then there exists a function $g : \mathbb{R} \to \mathbb{R}^{n \times n}$ characterized by

$$\mathcal{F}[g](y) = h(iy)^{-1}, \quad y \in \mathbb{R}, \tag{3.2}$$

the convolution

$$g * Z(t) := Z_t + \int_{\mathbb{R}} g * \eta(t-u)\, Z_u\, du \tag{3.3}$$

is well-defined for each $t \in \mathbb{R}$ almost surely, and $X_t = g * Z(t)$, $t \in \mathbb{R}$, is the unique (up to modification) stationary and integrable solution to (1.4). If, in addition to the above stated assumptions, $\det(h(z)) \neq 0$ for all $z \in \mathbb{C}$ with $\mathrm{Re}(z) \geq 0$ then the solution in (3.3) is causal in the sense that $(X_t)_{t\in\mathbb{R}}$ is adapted to the filtration $\{\sigma(Z_t - Z_s : s < t)\}_{t\in\mathbb{R}}$.

As discussed in Section 4.1, the solution $(X_t)_{t\in\mathbb{R}}$ to (1.4) will very often take form as a $(Z_t)_{t\in\mathbb{R}}$-driven moving average, that is,

$$X_t = \int_{\mathbb{R}} g(t-u)\, dZ_u, \quad t \in \mathbb{R}. \tag{3.4}$$

This fact justifies the notation $g * Z$ introduced in (3.3). In case $n = 1$, equation (1.4) reduces to the usual first order SDDE, and then the existence condition becomes $h(iy) = iy - \mathcal{F}[\eta](y) \neq 0$ for all $y \in \mathbb{R}$, and the kernel driving the solution is characterized by $\mathcal{F}[g](y) = 1/h(iy)$. This is consistent with earlier literature (cf. [16, 20]). The second main result concerns prediction of MSDDEs. In particular, the content of the result is that we can compute a prediction of future values of the observed process if we are able to compute the same type of prediction of the noise.
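As a sanity check of these conditions, consider the simplest univariate case where the delay measure is a point mass at zero. The following worked example is an illustration added here, with $\lambda > 0$ an assumed parameter and the Fourier convention $\mathcal{F}[\mu](y) = \mathcal{L}[\mu](iy)$ from Section 2.

```latex
% Worked example (n = 1): eta = -lambda * delta_0 with lambda > 0.
\begin{align*}
  h(z) &= z - \mathcal{L}[\eta](z) = z + \lambda,
  & h(iy) &= iy + \lambda \neq 0 \quad \text{for all } y \in \mathbb{R}, \\
  \mathcal{F}[g](y) &= \frac{1}{iy + \lambda},
  & g(t) &= e^{-\lambda t}\, \mathbf{1}_{[0,\infty)}(t), \\
  g * \eta(t) &= -\lambda g(t),
  & X_t &= Z_t - \lambda \int_{-\infty}^{t} e^{-\lambda (t-u)} Z_u \, du .
\end{align*}
% Since h(z) = z + lambda has its only zero at z = -lambda, h(z) != 0 whenever
% Re(z) >= 0, so the solution is causal. When the moving average form (3.4) is
% available, integration by parts turns X_t into the familiar
% Ornstein--Uhlenbeck representation
%   X_t = \int_{-\infty}^{t} e^{-\lambda (t-u)} \, dZ_u .
```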

Theorem 3.2. Suppose that $\det(h(z)) \neq 0$ for all $z \in \mathbb{C}$ with $\mathrm{Re}(z) \geq 0$ and that $\eta$ has second moment. Furthermore, let $(X_t)_{t\in\mathbb{R}}$ be the stationary and integrable solution to (1.4) and let $g$ be given by (3.2). Fix $s < t$. Then, if we set

$$\hat{Z}_u = E[Z_u - Z_s \mid Z_s - Z_r,\ r < s], \quad u > s, \tag{3.5}$$

it holds that

$$E[X_t \mid X_u,\ u \leq s] = g(t-s)X_s + \int_s^t g(t-u)\, \eta * \{\mathbf{1}_{(-\infty,s]}X\}(u)\, du + g * \{\mathbf{1}_{(s,\infty)}\hat{Z}\}(t),$$

using the notation

$$\big(\eta * \{\mathbf{1}_{(-\infty,s]}X\}(u)\big)_j = \sum_{k=1}^{n} \int_{[u-s,\infty)} X_{u-v}^k\, \eta_{jk}(dv)$$

and

$$\big(g * \{\mathbf{1}_{(s,\infty)}\hat{Z}\}(u)\big)_j = \sum_{k=1}^{n} \int_{[0,u-s)} \hat{Z}_{u-v}^k\, g_{jk}(dv)$$

for $u > s$ and $j = 1, \dots, n$.

Remark 3.3. In case $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process, the prediction formula in Theorem 3.2 simplifies, since $\hat{Z}_u = (u-s)E[Z_1]$ and thus

$$E[X_t \mid X_u,\ u \leq s] = g(t-s)X_s + \int_s^t g(t-u)\, \eta * \{\mathbf{1}_{(-\infty,s]}X\}(u)\, du + \int_s^t g(t-u)\, du\, E[Z_1]$$

using integration by parts. Obviously, the formula takes an even simpler form if $E[Z_1] = 0$. If instead we are in a long memory setting and $(Z_t)_{t\in\mathbb{R}}$ is a fractional Brownian motion, we can rely on [15] to obtain $(\hat{Z}_u)_{s<u\leq t}$ and then use the formula given in Theorem 3.2 to compute the prediction $E[X_t \mid X_u,\ u \leq s]$.

In Section 4.3 we use this prediction formula combined with the relation between

MSDDEs and MCARMA processes to obtain a prediction formula for any invertible

MCARMA process.

4 Examples and further results

In this section we will consider several examples of MSDDEs and give some additional

results. We begin by deﬁning what we mean by a regular integrator, since this makes it

possible to have the compact form

(3.4)

of the solution to

(1.4)

in most cases. Next, we

show how one can nest higher order MSDDEs in the (ﬁrst order) MSDDE framework.

Finally, we show that invertible MCARMA processes (and some generalizations) form

a particular subclass of solutions to higher order MSDDEs.

4.1 Regular integrators and moving average representations

When considering the form of the solution in Theorem 3.1 it is natural to ask if this can be seen as a moving average of the kernel $g$ with respect to the noise $(Z_t)_{t\in\mathbb{R}}$, that is, if

$$X_t^j = \Big( \int_{\mathbb{R}} g(t-u)\, dZ_u \Big)_j = \sum_{k=1}^{n} \int_{\mathbb{R}} g_{jk}(t-u)\, dZ_u^k, \quad t \in \mathbb{R}, \tag{4.1}$$

for $j = 1, \dots, n$. The next result shows that the answer is positive if $(Z_t^k)_{t\in\mathbb{R}}$ is a “reasonable” integrator for a suitable class of deterministic integrands for each $k = 1, \dots, n$.

Proposition 4.1. Let $h$ be the function given in (3.1) and suppose that, for all $y \in \mathbb{R}$, $\det(h(iy)) \neq 0$. Suppose further that $\eta$ has second moment and let $(X_t)_{t\in\mathbb{R}}$ be the solution to (1.4) given by (3.3). Finally assume that, for each $k = 1, \dots, n$, there exists a linear map $I_k : L^1 \cap L^2 \to L^1(P)$ which has the following properties:

(i) For all $s < t$, $I_k(\mathbf{1}_{(s,t]}) = Z_t^k - Z_s^k$.

(ii) If $\mu$ is a finite Borel measure on $\mathbb{R}$ having first moment then

$$I_k\Big( \int_{\mathbb{R}} f_r(t - \cdot)\, \mu(dr) \Big) = \int_{\mathbb{R}} I_k\big(f_r(t - \cdot)\big)\, \mu(dr) \tag{4.2}$$

almost surely for all $t \in \mathbb{R}$, where $f_r = \mathbf{1}_{[0,\infty)}(\cdot - r) - \mathbf{1}_{[0,\infty)}$ for $r \in \mathbb{R}$.

Then it holds that

$$X_t^j = \sum_{k=1}^{n} I_k\big(g_{jk}(t - \cdot)\big), \quad j = 1, \dots, n, \tag{4.3}$$

almost surely for each $t \in \mathbb{R}$. In this case, $(Z_t)_{t\in\mathbb{R}}$ will be called a regular integrator and we will write $\int_{\mathbb{R}} \cdot\, dZ^k = I_k$.

The typical example of a regular integrator is a multi-dimensional Lévy process:

Example 4.2. Suppose that $(Z_t)_{t\in\mathbb{R}}$ is an $n$-dimensional integrable Lévy process. Then, in particular, each $(Z_t^k)_{t\in\mathbb{R}}$ is an integrable (one-dimensional) Lévy process, and if $f \in L^1 \cap L^2$ the integral $\int_{\mathbb{R}} f(u)\, dZ_u^k$ is well-defined in the sense of [22] and belongs to $L^1(P)$. (The latter fact is easily derived from [22, Theorem 3.3].) Moreover, the stochastic Fubini result given in [2, Theorem 3.1] implies in particular that condition (ii) of Proposition 4.1 is satisfied, which shows that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator and that (4.1) holds.

We will now show that a class of multi-dimensional fractional Lévy processes

can serve as regular integrators as well (cf. Example 4.4 below). Fractional noise

processes are often used as a tool to incorporate (some variant of) long memory

in the corresponding solution process. As will appear, the integration theory for

fractional Lévy processes we will use below relies on the ideas of [17], but is extended

to allow for symmetric stable Lévy processes as well. For more on fractional stable

Lévy processes, the so-called linear fractional stable motions, we refer to [23, p. 343].

First, however, we will need the following observation:

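Proposition 4.3. Let $f : \mathbb{R} \to \mathbb{R}$ be a function in $L^1 \cap L^\alpha$ for some $\alpha \in (1, 2]$. Then the right-sided Riemann–Liouville fractional integral

$$I_-^\beta f : t \longmapsto \frac{1}{\Gamma(\beta)} \int_t^\infty f(u)(u-t)^{\beta-1}\, du \tag{4.4}$$

is well-defined and belongs to $L^\alpha$ for any $\beta \in (0, 1 - 1/\alpha)$.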

Example 4.4. Fix $\alpha_1, \dots, \alpha_n \in (1, 2]$ and consider an $n$-dimensional Lévy process $(L_t)_{t\in\mathbb{R}}$, where its $k$th coordinate $(L_t^k)_{t\in\mathbb{R}}$ is symmetric $\alpha_k$-stable if $\alpha_k \in (1, 2)$ and mean zero and square integrable if $\alpha_k = 2$. Then, for a given vector $\beta = [\beta_1, \dots, \beta_n]$ with $\beta_k \in (0, 1 - 1/\alpha_k)$ for $k = 1, \dots, n$ the corresponding fractional Lévy process $(Z_t)_{t\in\mathbb{R}}$ with parameter $\beta$ is defined entrywise as

$$Z_t^k = \int_{\mathbb{R}} \big( I_-^{\beta_k}[\mathbf{1}_{(-\infty,t]} - \mathbf{1}_{(-\infty,0]}] \big)(u)\, dL_u^k = \frac{1}{\Gamma(1+\beta_k)} \int_{\mathbb{R}} \big[ (t-u)_+^{\beta_k} - (-u)_+^{\beta_k} \big]\, dL_u^k \tag{4.5}$$

for $t \in \mathbb{R}$ and $k = 1, \dots, n$, where $x_+ = \max\{x, 0\}$. Proposition 4.3 shows that $I_-^{\beta_k} f \in L^{\alpha_k}$ for any $f \in L^1 \cap L^{\alpha_k}$, and hence we can define integration of $f$ with respect to $(Z_t^k)_{t\in\mathbb{R}}$ through $(L_t^k)_{t\in\mathbb{R}}$ as

$$\int_{\mathbb{R}} f(t)\, dZ_t^k = \int_{\mathbb{R}} \big( I_-^{\beta_k} f \big)(t)\, dL_t^k.$$

Note that the integral belongs to $L^2(P)$ if $\alpha_k = 2$ and to $L^\gamma(P)$ for any $\gamma < \alpha_k$ if $\alpha_k \in (1, 2)$. While the integral clearly satisfies assumption (i) of Proposition 4.1 in light of (4.5), one can rely on the stochastic Fubini result for $(L_t)_{t\in\mathbb{R}}$ given in [2, Theorem 3.1] to verify that assumption (ii) is satisfied as well. Consequently, $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator and the solution $(X_t)_{t\in\mathbb{R}}$ to (1.4) takes the moving average form (4.1).

At this point it should be clear that the conditions for being a regular integrator

are mild, hence they will, besides the examples mentioned above, also be satisﬁed for

a wide class of semimartingales with stationary increments.

4.2 Higher order (multivariate) SDDEs

An advantage of introducing the multivariate setting

(1.4)

is that we can nest higher

order MSDDEs in this framework. Eﬀectively, as usual and as will be demonstrated

below, it is done by increasing the dimension accordingly.

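Let $\varpi_0, \varpi_1, \dots, \varpi_{m-1}$ be (entrywise) finite $n \times n$ measures concentrated on $[0, \infty)$ which all admit second moment, and let $(Z_t)_{t\in\mathbb{R}}$ be an $n$-dimensional integrable stochastic process with stationary increments. For convenience we will assume that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator in the sense of Proposition 4.1. We will say that an $n$-dimensional stationary, integrable and measurable process $(X_t)_{t\in\mathbb{R}}$ satisfies the corresponding $m$th order MSDDE if it is $m - 1$ times differentiable and

$$dX_t^{(m-1)} = \sum_{j=0}^{m-1} \varpi_j * X^{(j)}(t)\, dt + dZ_t, \quad t \in \mathbb{R}, \tag{4.6}$$

where $(X_t^{(j)})_{t\in\mathbb{R}}$ denotes the entrywise $j$th derivative of $(X_t)_{t\in\mathbb{R}}$ with respect to $t$. By (4.6) we mean that

$$\big(X_t^{(m-1)}\big)^k - \big(X_s^{(m-1)}\big)^k = \sum_{j=0}^{m-1} \sum_{l=1}^{n} \int_s^t \int_{[0,\infty)} \big(X_{u-v}^{(j)}\big)^l\, (\varpi_j)_{kl}(dv)\, du + Z_t^k - Z_s^k$$

for $k = 1, \dots, n$ and each $s < t$ almost surely. Equation (4.6) corresponds to the $mn$-dimensional MSDDE in (1.4) with noise $[0, \dots, 0, Z_t^\top]^\top \in \mathbb{R}^{mn}$ and

$$\eta = \begin{bmatrix} 0 & I_n\delta_0 & 0 & \cdots & 0 \\ 0 & 0 & I_n\delta_0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I_n\delta_0 \\ \varpi_0 & \varpi_1 & \varpi_2 & \cdots & \varpi_{m-1} \end{bmatrix}. \tag{4.7}$$

(If $m = 1$ then $\eta = \varpi_0$.) With $\eta$ given by (4.7) it follows that

$$D(\eta) = \bigcap_{j=0}^{m-1} D(\varpi_j)$$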

4 · Examples and further results

and

h(z) =







−I

0 ··· 0

0 zI

−I

··· 0

0 0 ··· zI

−I

−L[$

](z) −L[$

](z) ··· −L[$

m−2

](z) zI

−L[$

m−1

](z)







for

z ∈ D

(

). In general, we know from Theorem 3.1 that a solution to

(4.6)

exists if

deth(iy) , 0 for all y ∈R, and in this case the unique solution is given by

(t −u) dZ

, t ∈ R, (4.8)

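where $\mathcal{F}[g_{1m}]$ corresponds to entry $(1,m)$ in the $n\times n$ block representation of $h(i\,\cdot)^{-1}$. In other words, if $e_j$ denotes the $j$th canonical basis vector of $\mathbb{R}^m$ and $\otimes$ the Kronecker product then

$$\mathcal{F}[g_{1m}](y) = (e_1\otimes I_n)^\top h(iy)^{-1}(e_m\otimes I_n), \quad y\in\mathbb{R}.$$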

However, due to the particular structure of (4.7) we can simplify these expressions:

Theorem 4.5. Let the setup be as above. Then it holds that

$$\det h(z) = \det\Bigl(I_n z^m - \sum_{j=0}^{m-1}\mathcal{L}[\varpi_j](z)\,z^j\Bigr), \quad z\in D(\eta), \tag{4.9}$$

and if $\det h(iy)\neq 0$ for all $y\in\mathbb{R}$, there exists a unique solution to (4.6) and it is given as (4.8) where $g_{1m}\colon\mathbb{R}\to\mathbb{R}^{n\times n}$ is characterized by

$$\mathcal{F}[g_{1m}](y) = \Bigl(I_n(iy)^m - \sum_{j=0}^{m-1}\mathcal{F}[\varpi_j](y)(iy)^j\Bigr)^{-1}, \quad y\in\mathbb{R}. \tag{4.10}$$

The solution is causal if $\det h(z)\neq 0$ whenever $\operatorname{Re}(z)\geq 0$.
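As a numerical sanity check of the determinant identity (4.9), the following sketch treats the univariate case $n=1$ with $m=3$ and delay measures consisting of finitely many point masses, so that $\mathcal{L}[\varpi_j](z)=\sum_k w_k e^{-z u_k}$. The specific weights, locations and the evaluation point are illustrative assumptions, not values taken from the text, and the helper names are hypothetical.

```python
import cmath

def det(matrix):
    """Determinant of a square complex matrix (Gaussian elimination, partial pivoting)."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def laplace(masses, z):
    """Laplace transform of a measure given as (location, weight) point masses."""
    return sum(weight * cmath.exp(-z * loc) for loc, weight in masses)

def companion_h(measures, z):
    """h(z) = I z - L[eta](z) with eta as in (4.7), for n = 1."""
    m = len(measures)
    h = [[0j] * m for _ in range(m)]
    for i in range(m):
        h[i][i] = z                      # diagonal entries z
    for i in range(m - 1):
        h[i][i + 1] = -1 + 0j            # superdiagonal entries -1
    for j in range(m):
        h[m - 1][j] -= laplace(measures[j], z)   # last row carries the delay terms
    return h

def reduced_det(measures, z):
    """Right-hand side of (4.9) for n = 1."""
    m = len(measures)
    return z ** m - sum(laplace(measures[j], z) * z ** j for j in range(m))

# Illustrative (assumed) measures: varpi_0, varpi_1, varpi_2 as point masses.
measures = [
    [(0.0, -2.0), (1.0, 0.5)],
    [(0.0, -1.5)],
    [(0.0, -3.0), (0.5, 0.25)],
]
z = 0.3 + 1.7j
lhs = det(companion_h(measures, z))
rhs = reduced_det(measures, z)
print(abs(lhs - rhs))
```

The printed difference should be at the level of floating-point rounding, which is what (4.9) predicts for this particular choice.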

Observe that, as should be the case, we are back to the first order MSDDE when $m=1$ and (4.9)–(4.10) agree with Theorem 3.1. As we will see in Section 4.3 below, one motivation for introducing higher order MSDDEs of the form (4.6), and for studying the structure of the associated solutions, is their relation to MCARMA processes. However, we start with the multivariate CAR($p$) process, where no delay term will be present, as an example:

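Example 4.6. Let $P(z) = I_n z^p + A_1 z^{p-1} + \cdots + A_p$, $z\in\mathbb{C}$, for suitable $A_1,\dots,A_p\in\mathbb{R}^{n\times n}$. The associated CAR($p$) process $(X_t)_{t\in\mathbb{R}}$ with noise $(Z_t)_{t\in\mathbb{R}}$ can be thought of as formally satisfying $P(D)X_t = DZ_t$, $t\in\mathbb{R}$, where $D$ denotes differentiation with respect to $t$. Integrating both sides and rearranging terms gives

$$dX_t^{(p-1)} = -\sum_{j=0}^{p-1} A_{p-j}X_t^{(j)}\,dt + dZ_t, \quad t\in\mathbb{R}, \tag{4.11}$$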


which is of the form (4.6) with $m=p$ and $\varpi_j = -A_{p-j}\delta_0$ for $j=0,\dots,p-1$. Theorem 4.5 shows that a unique solution exists if

$$\det\Bigl(I_n(iy)^p + \sum_{j=0}^{p-1}A_{p-j}(iy)^j\Bigr) = \det P(iy)\neq 0$$

for all $y\in\mathbb{R}$, and in this case $\mathcal{F}[g_{1m}](y) = P(iy)^{-1}$ for $y\in\mathbb{R}$. This agrees with the rigorous definition of the CAR($p$) process, see e.g. [19]. In case $p=1$, (4.11) collapses to the multivariate Ornstein–Uhlenbeck equation

$$dX_t = -A_1X_t\,dt + dZ_t, \quad t\in\mathbb{R},$$

and if the real parts of all the eigenvalues of $A_1$ are positive, it is easy to check that $g_{11}(t) = e^{-A_1t}1_{[0,\infty)}(t)$ so that the unique solution $(X_t)_{t\in\mathbb{R}}$ is causal and takes the well-known form

$$X_t = \int_{-\infty}^t e^{-A_1(t-u)}\,dZ_u, \quad t\in\mathbb{R}. \tag{4.12}$$

Lévy-driven multivariate Ornstein–Uhlenbeck processes have been studied extensively in the literature, and the moving average structure (4.12) of the solution is well-known when $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process. We refer to [1, 24, 25] for further details. The one-dimensional case where $(Z_t)_{t\in\mathbb{R}}$ is allowed to be a general stationary increment process has been studied in [2].
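For the univariate Ornstein–Uhlenbeck case, the claim that the kernel $e^{-at}1_{[0,\infty)}(t)$ has Fourier transform $P(iy)^{-1}$ with $P(z)=z+a$ can be checked numerically. The sketch below assumes the convention $\mathcal{F}[f](y)=\int e^{-iyt}f(t)\,dt$, which is the one consistent with $h(iy)=I_n iy-\mathcal{F}[\eta](y)$; the parameter values, the truncation horizon and the function name are illustrative assumptions.

```python
import cmath

def fourier_of_ou_kernel(a, y, horizon=40.0, steps=200_000):
    """Trapezoidal approximation of the integral of exp(-i*y*t) * exp(-a*t) over [0, horizon]."""
    dt = horizon / steps
    rate = a + 1j * y
    total = 0.5 * (1.0 + cmath.exp(-rate * horizon))   # end-point contributions
    for k in range(1, steps):
        total += cmath.exp(-rate * k * dt)
    return total * dt

a, y = 1.5, 2.0                       # assumed decay rate and frequency
numeric = fourier_of_ou_kernel(a, y)
closed_form = 1.0 / (1j * y + a)      # P(iy)^{-1} with P(z) = z + a
print(abs(numeric - closed_form))
```

The truncation at a finite horizon is harmless here because the kernel decays exponentially; for a matrix-valued $A_1$ the same check would require a matrix exponential.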

4.3 Relations to MCARMA processes

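Let $p\in\mathbb{N}$ and define the polynomials $P,Q\colon\mathbb{C}\to\mathbb{C}^{n\times n}$ by

$$P(z) = I_nz^p + A_1z^{p-1} + \cdots + A_p \quad\text{and}\quad Q(z) = B_0 + B_1z + \cdots + B_{p-1}z^{p-1} \tag{4.13}$$

for $z\in\mathbb{C}$ and suitable $A_1,\dots,A_p,B_0,\dots,B_{p-1}\in\mathbb{R}^{n\times n}$. We will also fix $q\in\mathbb{N}_0$, $q<p$, and set $B_q = I_n$ and $B_j = 0$ for all $q<j<p$. It will always be assumed that $\det P(iy)\neq 0$ for all $y\in\mathbb{R}$. Under this assumption there exists a function $g\colon\mathbb{R}\to\mathbb{R}^{n\times n}$ which is in $L^1\cap L^2$ and satisfies

$$\mathcal{F}[g](y) = P(iy)^{-1}Q(iy), \quad y\in\mathbb{R}. \tag{4.14}$$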

Consequently, for any regular integrator $(Z_t)_{t\in\mathbb{R}}$ in the sense of Proposition 4.1, the $n$-dimensional stationary and integrable process $(X_t)_{t\in\mathbb{R}}$ given by

$$X_t = \int_{\mathbb{R}} g(t-u)\,dZ_u, \quad t\in\mathbb{R}, \tag{4.15}$$

is well-defined. If it is additionally assumed that $\det P(z)\neq 0$ for $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$ then it is argued in [19] that

$$g(t) = (e_1^p\otimes I_n)^\top e^{At}E, \quad t\geq 0, \tag{4.16}$$

where

$$A = \begin{bmatrix}
0 & I_n & 0 & \cdots & 0\\
0 & 0 & I_n & \cdots & 0\\
\vdots & \vdots & \ddots & \ddots & \vdots\\
0 & 0 & \cdots & 0 & I_n\\
-A_p & -A_{p-1} & \cdots & -A_2 & -A_1
\end{bmatrix}
\quad\text{and}\quad
E = \begin{bmatrix} E_1\\ \vdots\\ E_p\end{bmatrix}$$

with $E(z) = E_1z^{p-1} + \cdots + E_p$ chosen such that

$$z\longmapsto P(z)E(z) - Q(z)z^p$$

is at most of degree $p-1$. (Above, and henceforth, we use the notation $e_j^p$ for the $j$th canonical basis vector of $\mathbb{R}^p$.) We will refer to the process $(X_t)_{t\in\mathbb{R}}$ as a $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA($p,q$) process. For instance, when $(Z_t)_{t\in\mathbb{R}}$ is an $n$-dimensional Lévy process, $(X_t)_{t\in\mathbb{R}}$ is a (Lévy-driven) MCARMA($p,q$) process as introduced in [19]. If $(L_t)_{t\in\mathbb{R}}$ is an $n$-dimensional square integrable Lévy process with mean zero, and

$$Z_t^j = \frac{1}{\Gamma(1+\beta_j)}\int_{\mathbb{R}}\bigl[(t-u)_+^{\beta_j} - (-u)_+^{\beta_j}\bigr]\,dL_u^j, \quad t\in\mathbb{R},$$

for $\beta_j\in(0,1/2)$ and $j=1,\dots,n$, then $(X_t)_{t\in\mathbb{R}}$ is an MFICARMA($p,\beta,q$) process, $\beta = [\beta_1,\dots,\beta_n]$, as studied in [18]. For the univariate case ($n=1$), the processes above correspond to the CARMA($p,q$) and FICARMA($p,\beta_1,q$) process, respectively. The class of CARMA processes has been studied extensively, and we refer to the references in the introduction for details.

Remark 4.7. Observe that, generally, Lévy-driven MCARMA (hence CARMA) processes are defined even when $(Z_t)_{t\in\mathbb{R}}$ has no more than log moments. However, it relies heavily on the fact that $g$ and $(Z_t)_{t\in\mathbb{R}}$ are well-behaved enough to ensure that the process in (4.15) remains well-defined. At this point, a setup where the noise does not admit a first moment has not been integrated in a framework as general as that of (1.4).

In the following our aim is to show that, under a suitable invertibility assumption, the $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA($p,q$) process given in (4.15) is the unique solution to a certain (possibly higher order) MSDDE of the form (4.6). Before formulating the main result of this section we introduce some notation. To $P$ and $Q$ defined in (4.13) we will associate the unique polynomial $R(z) = I_nz^{p-q} + C_{p-q-1}z^{p-q-1} + \cdots + C_0$, $z\in\mathbb{C}$ and $C_0,\dots,C_{p-q-1}\in\mathbb{R}^{n\times n}$, having the property that

$$z\longmapsto Q(z)R(z) - P(z) \tag{4.17}$$

is a polynomial of at most order $q-1$ (see the introduction for an intuition about why this property is desirable).
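In the univariate case the polynomial $R$ in (4.17) is simply the quotient obtained when dividing $P$ by the monic polynomial $Q$, and $Q R - P$ is minus the remainder. The following sketch performs this long division for scalar coefficients; the function name and the numerical coefficients (a CARMA(3,1) example with $P(z)=z^3+6z^2+11z+6$ and $Q(z)=z+4$) are illustrative assumptions.

```python
def poly_divmod(p, q):
    """Long division of p by q; coefficients are listed from highest to lowest degree."""
    rem = list(p)
    quotient = []
    for i in range(len(p) - len(q) + 1):
        coeff = rem[i] / q[0]
        quotient.append(coeff)
        for j, qj in enumerate(q):
            rem[i + j] -= coeff * qj       # subtract coeff * z^(shift) * q
    return quotient, rem[len(p) - len(q) + 1:]

P = [1.0, 6.0, 11.0, 6.0]   # z^3 + A_1 z^2 + A_2 z + A_3 with A = (6, 11, 6)
Q = [1.0, 4.0]              # z + B_0 with B_0 = 4
R, remainder = poly_divmod(P, Q)

print(R)          # [1.0, 2.0, 3.0]  i.e. R(z) = z^2 + C_1 z + C_0 with C_1 = 2, C_0 = 3
print(remainder)  # [-6.0]           i.e. Q(z)R(z) - P(z) = 6, of degree q - 1 = 0
```

With these numbers $C_1 = A_1 - B_0 = 2$ and $C_0 = A_2 - B_0(A_1 - B_0) = 3$. In the genuinely multivariate case the coefficients are matrices and the order of multiplication matters, so this scalar routine does not apply verbatim.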

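Theorem 4.8. Let $P$ and $Q$ be given as in (4.13), and let $(X_t)_{t\in\mathbb{R}}$ be the associated $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA($p,q$) process. Suppose that $\det Q(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$. Then $(X_t)_{t\in\mathbb{R}}$ is the unique solution to (4.6) with $m = p-q$, $\varpi_0(du) = -C_0\delta_0(du) + f(u)\,du$ and $\varpi_j = -C_j\delta_0$ for $1\leq j\leq m-1$ or, written out,

$$dX_t^{(m-1)} = -\sum_{j=0}^{m-1}C_jX_t^{(j)}\,dt + \Bigl(\int_0^\infty f(u)X_{t-u}\,du\Bigr)dt + dZ_t, \quad t\in\mathbb{R}, \tag{4.18}$$

where $C_0,\dots,C_{m-1}\in\mathbb{R}^{n\times n}$ are defined as in (4.17) above, $(X_t^{(j)})_{t\in\mathbb{R}}$ is the $j$th derivative of $(X_t)_{t\in\mathbb{R}}$, and where $f\colon\mathbb{R}\to\mathbb{R}^{n\times n}$ is characterized by

$$\mathcal{F}[f](y) = R(iy) - Q(iy)^{-1}P(iy), \quad y\in\mathbb{R}. \tag{4.19}$$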


It follows from Theorem 4.8 that $p-q$ is the order of the (possibly multivariate) SDDE we can associate with a (possibly multivariate) CARMA process. Thus, this seems like a natural extension of [3], where the univariate first order SDDE is studied and related to the univariate CARMA(2,1) process.

Remark 4.9. An immediate consequence of Theorem 4.8 is that we obtain an inversion formula for $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA processes. In other words, it shows how to recover the increments of $(Z_t)_{t\in\mathbb{R}}$ from observing $(X_t)_{t\in\mathbb{R}}$. For this reason it seems natural to impose the invertibility assumption $\det Q(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$, which is the direct analogue of the one for discrete time ARMA processes (or, more generally, moving averages). It is usually referred to as the minimum phase property in signal processing. The inversion problem for (Lévy-driven) CARMA processes has been studied in [7, 8, 9, 21] and for (Lévy-driven) MCARMA processes in [11]. In both cases, different approaches that do not rely on MSDDEs are used.

Remark 4.10. Since the Fourier transform $\mathcal{F}[f]$ of the function $f$ defined in Theorem 4.8 is rational, one can determine $f$ explicitly (e.g., by using the partial fraction expansion of $\mathcal{F}[f]$). Indeed, since the Fourier transform of $f$ is of the same form as the Fourier transform of the solution kernel $g$ of the MCARMA process we can deduce that

$$f(t) = (e_1^q\otimes I_n)^\top e^{Bt}F, \quad t\geq 0, \tag{4.20}$$

with

$$B = \begin{bmatrix}
0 & I_n & 0 & \cdots & 0\\
0 & 0 & I_n & \cdots & 0\\
\vdots & \vdots & \ddots & \ddots & \vdots\\
0 & 0 & \cdots & 0 & I_n\\
-B_0 & -B_1 & \cdots & -B_{q-2} & -B_{q-1}
\end{bmatrix}
\quad\text{and}\quad
F = \begin{bmatrix}F_1\\ \vdots\\ F_q\end{bmatrix},$$

where $F(z) = F_1z^{q-1} + \cdots + F_q$ is chosen such that

$$z\longmapsto Q(z)F(z) - [Q(z)R(z) - P(z)]z^q$$

is at most of degree $q-1$ (see (4.14) and (4.16)).

In Corollary 4.11 we formulate the prediction formula in Theorem 3.2 in the special case where $(X_t)_{t\in\mathbb{R}}$ is a $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA process. In the formulation we use the definition

$$\hat{Z}_u = \mathbb{E}[Z_u - Z_s \mid Z_s - Z_r,\ r<s], \quad u>s,$$

in line with (3.5).

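Corollary 4.11. Let $(X_t)_{t\in\mathbb{R}}$ be a $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA process and set

$$g_j(t) = (e_1^p\otimes I_n)^\top e^{At}\sum_{k=j}^{p-q}A^{k-j}EC_k, \quad t\geq 0,$$

for $j=1,\dots,p-q$, where $C_0,\dots,C_{p-q-1}$ are given in (4.17) and $C_{p-q} = I_n$. Suppose that $\det P(z)\neq 0$ and $\det Q(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$. Fix $s<t$. Then the following prediction formula holds

$$\mathbb{E}[X_t\mid X_u,\ u\leq s] = \sum_{j=1}^{p-q}g_j(t-s)X_s^{(j-1)} + \int_{-\infty}^s\int_s^t g(t-u)f(u-v)\,du\,X_v\,dv + g*\{\hat{Z}1_{(s,\infty)}\}(t),$$

where $g$ and $f$ are given in (4.16) and (4.20), respectively, and

$$g*\{\hat{Z}1_{(s,\infty)}\}(t) = 1_{\{p=q+1\}}\hat{Z}_t + (e_1^p\otimes I_n)^\top e^{At}\int_s^t e^{-Av}AE\,\hat{Z}_v\,dv.$$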

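Example 4.12. To illustrate the results above we will consider an $n$-dimensional $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA(3, 1) process $(X_t)_{t\in\mathbb{R}}$ with $P$ and $Q$ polynomials given by

$$P(z) = I_nz^3 + A_1z^2 + A_2z + A_3 \quad\text{and}\quad Q(z) = B_0 + I_nz$$

for matrices $B_0,A_1,A_2,A_3\in\mathbb{R}^{n\times n}$, such that $\det P(z)\neq 0$ and $\det Q(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$. According to (4.16), $(X_t)_{t\in\mathbb{R}}$ may be written as

$$X_t = \int_{-\infty}^t (e_1^3\otimes I_n)^\top e^{A(t-u)}E\,dZ_u, \quad t\in\mathbb{R},$$

where $E_1 = 0$, $E_2 = I_n$ and $E_3 = B_0 - A_1$. With $C_1 = A_1 - B_0$, $C_0 = A_2 - B_0(A_1 - B_0)$ and $F = B_0(A_2 - B_0(A_1 - B_0)) - A_3$, Theorem 4.8 and Remark 4.10 imply that

$$dX_t^{(1)} = -C_1X_t^{(1)}\,dt - C_0X_t\,dt + \Bigl(\int_0^\infty e^{-B_0u}FX_{t-u}\,du\Bigr)dt + dZ_t, \quad t\in\mathbb{R}.$$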

Moreover, by Corollary 4.11, we have the prediction formula

E[X

| X

, u ≤ s] = (e

⊗I

)



(EC

+ AE)X

+ EX

(1)

−Au



−∞

−B

dv +





5 Proofs and auxiliary results

We will start this section by discussing some technical results. These results will then be used in the proofs of all the results stated above. Recall the definition of $h\colon D(\eta)\to\mathbb{C}^{n\times n}$ in (3.1). Note that we always have $\{z\in\mathbb{C} : \operatorname{Re}(z)\geq 0\}\subseteq D(\eta)$ and $h(iy) = I_niy - \mathcal{F}[\eta](y)$ for $y\in\mathbb{R}$. Provided that $\eta$ is sufficiently nice, Proposition 5.1 below ensures the existence of a kernel $g\colon\mathbb{R}\to\mathbb{R}^{n\times n}$ which will drive the solution to (1.4).

Proposition 5.1. Let $h$ be given as in (3.1) and suppose that $\det h(iy)\neq 0$ for all $y\in\mathbb{R}$. Then there exists a function $g = [g_{jk}]\colon\mathbb{R}\to\mathbb{R}^{n\times n}$ in $L^2$ characterized by

$$\mathcal{F}[g](y) = h(iy)^{-1}, \quad y\in\mathbb{R}. \tag{5.1}$$

Moreover, the following statements hold:

(i) The function $g$ satisfies

$$g(t-r) - g(s-r) = 1_{(s,t]}(r)I_n + \int_s^t g*\eta(u-r)\,du$$

for almost all $r\in\mathbb{R}$ and each fixed $s<t$.

(ii) If $\eta$ has moment of order $p\in\mathbb{N}$, then $g\in L^q$ for all $q\in[1/p,\infty]$, and

$$g(t) = 1_{[0,\infty)}(t)I_n + \int_{-\infty}^t g*\eta(u)\,du \tag{5.2}$$

for almost all $t\in\mathbb{R}$. In particular,

$$\int_{\mathbb{R}} g*\eta(u)\,du = -I_n. \tag{5.3}$$

(iii) If $\int_{[0,\infty)}e^{\delta u}\,|\eta_{jk}|(du)<\infty$ for all $j,k=1,\dots,n$ and some $\delta>0$, then there exists $\varepsilon>0$ such that

$$\sup_{t\in\mathbb{R}}\max_{j,k=1,\dots,n}|g_{jk}(t)|e^{\varepsilon|t|}\leq C$$

for a suitable constant $C>0$.

(iv) If $\det h(z)\neq 0$ for all $z\in\mathbb{C}$ with $\operatorname{Re}(z)\geq 0$, then $g$ is vanishing on $(-\infty,0)$ almost everywhere.
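Statement (ii) can be made concrete in the simplest case. The sketch below assumes $n=1$ and $\eta=-\lambda\delta_0$ with $\lambda>0$ (the Ornstein–Uhlenbeck case), for which $g(t)=e^{-\lambda t}1_{[0,\infty)}(t)$ and $g*\eta=-\lambda g$, and checks (5.2) and (5.3) by numerical integration. The value of $\lambda$, the evaluation point, the truncation horizon and the helper names are illustrative assumptions.

```python
import math

def g(t, lam):
    """OU kernel exp(-lam * t) on [0, infinity), zero on the negative half-line."""
    return math.exp(-lam * t) if t >= 0 else 0.0

def g_conv_eta(u, lam):
    """(g * eta)(u) for eta = -lam * delta_0, which equals -lam * g(u)."""
    return -lam * g(u, lam)

def integral(f, lower, upper, steps=100_000):
    """Midpoint rule on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (k + 0.5) * width) for k in range(steps))

lam, t = 0.8, 1.3
# g * eta vanishes on (-infinity, 0), so the integral in (5.2) starts at 0.
rhs_52 = 1.0 + integral(lambda u: g_conv_eta(u, lam), 0.0, t)
# (5.3): total integral of g * eta, truncated at a large horizon.
total_mass = integral(lambda u: g_conv_eta(u, lam), 0.0, 60.0)

print(abs(rhs_52 - g(t, lam)), abs(total_mass + 1.0))
```

Both printed differences should be small, reflecting $1-\lambda\int_0^t e^{-\lambda u}\,du=e^{-\lambda t}$ and $-\lambda\int_0^\infty e^{-\lambda u}\,du=-1$.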

Proof. In order to show the existence of $g$ it suffices to argue that

$$\bigl(y\longmapsto[h(iy)^{-1}]_{jk}\bigr)\in L^2 \quad\text{for } j,k=1,\dots,n, \tag{5.4}$$

since the Fourier transform $\mathcal{F}$ maps $L^2$ onto $L^2$. (Here $[h(iy)^{-1}]_{jk}$ refers to the $(j,k)$-th entry in the matrix $h(iy)^{-1}$.) Indeed, in this case we just set $g_{jk} = \mathcal{F}^{-1}[[h(i\,\cdot)^{-1}]_{jk}]$. Let $H(y)$ denote the $n\times n$ matrix which has the same rows as $h(iy)$, but where the $j$th column is replaced by the $k$th canonical basis vector (that is, the vector with all entries equal to zero except for the $k$th entry which equals one). Then it follows by Cramer's rule that

$$[h(iy)^{-1}]_{jk} = \frac{\det H(y)}{\det h(iy)}, \quad y\in\mathbb{R}.$$

Recalling that $h(iy) = I_niy - \mathcal{F}[\eta](y)$ and that $\mathcal{F}[\eta_{lm}](y)$ is bounded in $y$ we get by the Leibniz formula that $|\det h(iy)|\sim|y|^n$ and $|\det H(y)| = O(|y|^{n-1})$ as $|y|\to\infty$. This shows in particular that

$$\bigl|[h(iy)^{-1}]_{jk}\bigr| = O\bigl(|y|^{-1}\bigr), \quad |y|\to\infty. \tag{5.5}$$

Since $j$ and $k$ were arbitrarily chosen we get by continuity of (all the entries of) $y\mapsto h(iy)^{-1}$ that (5.4) holds, which ensures the existence part. The fact that $\mathcal{F}[g](-y) = \overline{\mathcal{F}[g](y)}$, $y\in\mathbb{R}$, implies that $g$ takes values in $\mathbb{R}^{n\times n}$.

To show (i), we fix $s<t$ and apply the Fourier transform to obtain

$$\begin{aligned}
\mathcal{F}\Bigl[g(t-\cdot) - g(s-\cdot) - \int_s^t g*\eta(u-\cdot)\,du\Bigr](y)
&= (e^{-ity} - e^{-isy})\mathcal{F}[g](-y) - \mathcal{F}[1_{(s,t]}](y)\mathcal{F}[g](-y)\mathcal{F}[\eta](-y)\\
&= \mathcal{F}[1_{(s,t]}](y)\,h(-iy)^{-1}\bigl(-I_niy - \mathcal{F}[\eta](-y)\bigr)\\
&= \mathcal{F}[1_{(s,t]}](y)\,I_n,
\end{aligned}$$

which verifies the result.

We will now show (ii) and for this we suppose that

has a moment of order

p ∈ N

Then it follows that

h: y 7→ h

(

) is (entrywise)

times diﬀerentiable with the

derivative given by

iδ

({m −1}∩{j −k}) −(−i)

[0,∞)

iuy

(du)

, m = 1,...,p,

and in particular all the the entries of (

)(

) are bounded in

. Observe that, clearly,

if a function A: R → C

n×n

takes the form

A(t) = B(t)C(t)D(t), t ∈ R, (5.6)

where all the entries of

B,D : R → C

n×n

decay at least as

|y|

−1

|y|→ ∞

and all the

entries of

C : R → C

n×n

are bounded, then all the entries of

decay at least as

|y|

−1

|y|→ ∞. Using the product rule for diﬀerentiation and the fact that

−1

)(y) = −

h(y)

−1

h)(y)

h(y)

−1

, y ∈ R,

it follows recursively that

−1

is a sum of functions of the form

(5.6)

, and thus

all its entries decay at least as

|y|

−1

|y| → ∞

for

= 1

,... , p

. Since the entries of

−1

are continuous as well, they belong to

and we can use the inverse Fourier

transform

−1

to conclude that

−1

[

] =



t 7→

(

−it

)

(

)



is an

function. This

implies in turn that t 7→ g

(t)(1 + |t|)

∈ L

and, thus,

(t)|

dt ≤





(t)(1 + |t|)







(1 + |t|)

−

2pq

2−q



1−

< ∞

for any

q ∈

/p,

2) and

j,k

= 1

,... , n

. By using the particular observation that

g ∈ L

and (i) we obtain that

g(t) = 1

[0,∞)

(t)I

−∞

g ∗η(u) du (5.7)

for (almost) all t ∈ R. This shows that

(t)| ≤ 1 +

|[g ∗η(u)]

| du ≤ 1 +

l=1

(u)| du |η

|([0,∞))

for all

t ∈ R

and for every

j,k

= 1

,... , n

which implies

g ∈L

∞

and, thus,

g ∈L

for all

q ∈ [1/p,∞]. Since g(t) →0 entrywise as t → ∞, we get by (5.7) that

g ∗η(u) du = −I

which concludes the proof of (ii).

Now suppose that

[0,∞)

δu

|η

(

)

< ∞

for all

j,k

= 1

,... , n

and some

δ >

0. In

this case, S

B {z ∈ C : Re(z) ∈[−δ,δ]} ⊆ D(η) and

z 7−→deth(z) = det



z −

[0,∞)

η(du)




is strictly separated from 0 when

z ∈ S

and

|z|

is suﬃciently large. Indeed, the

dominating term in deth(z) is z

when |z| is large, since



[0,∞)

η(du)



≤ max

l,m=1,...,n

[0,∞)

δu

|η

|(du) for j,k = 1,...,n.

Using this together with the continuity of

z 7→ deth

(

) implies that there exists

δ ∈

,δ

] so that

z 7→ deth

(

) is strictly separated from 0 on

B {z ∈ C

(

)

∈

[

−

δ,

]

}

In particular,

z 7→

[

(

)

−1

]

is bounded on any compact set of

, and by using

Cramer’s rule and the Leibniz formula as in

(5.5)

we get that

[

(

)

−1

]

(

|z|

−1

) as

|z| → ∞ provided that z ∈ S

. Consequently,

sup

x∈[−

δ,

δ]



h(x + iy)

−1



dy < ∞,

and this implies that

t 7→ g

(

)

εt

∈ L

for all

ε ∈

(

−

δ,

). This implication is a slight

extension of the characterization of Hardy functions given in [13, Theorem 1 (Sec-

tion 3.4)]; a general statement and the corresponding proof can be found in [3,

Lemma 4.1].

Now ﬁx any

ε ∈

) and

j,k ∈ {

,... , n}

, and observe from

(5.7)

that

is abso-

lutely continuous on both [0

,∞

) and (

−∞,

0) with density [

g ∗η

]

. Consequently, for

ﬁxed t > 0, integration by parts yields

(t)|e

εt

≤ |g

(0)|+



[g ∗η(u)]



εu

du + ε

(u)|e

εu

du. (5.8)

Since



[g ∗η(u)]



εu

du ≤

l=1

(u)|e

εu

[0,∞)

εu

|η

|(du)

it follows from (5.8) that

max

j,k=1,...,n

(t)| ≤ Ce

−εt

for all t > 0 with

C B 1 + max

j,k=1,...,n



l=1

(u)|e

ε|u|

[0,∞)

εu

|η

|(du) + ε

(u)|e

ε|u|



By considering

−ε

rather than

in the above calculations one reaches the conclusion

that

max

j,k=1,...,n

(t)| ≤ Ce

εt

, t < 0,

and this veriﬁes (iii).

Finally, suppose that

deth

(

)

0 for all

z ∈ C

with

(

)

≥

0. Then it holds that

and, thus,

z 7→ h

(

)

−1

is continuous on

{z ∈ C

(

)

≥

}

and analytic on

{z ∈

(

)

}

. Moreover, arguments similar to those in

(5.5)

show that



[h(z)

−1

]



O(|z|

−1

) as |z| → ∞, and thus we may deduce that

sup

x>0

|(h(x + iy)

−1

)

| dy < ∞.

From the theory on Hardy spaces, see [12] or [13, Section 3.4], this implies that

vanishing on (−∞,0) almost everywhere, which veriﬁes (iv) and ends the proof. 


From Proposition 5.1 it becomes evident that we may (and, hence, do) choose the ker-

nel

to satisfy

(5.2)

pointwise, so that the function induces a ﬁnite Lebesgue–Stieltjes

measure

(

). We summarize a few properties of this measure in the corollary below.

Corollary 5.2. Let $h$ be the function introduced in (3.1) and suppose that $\det h(iy)\neq 0$ for all $y\in\mathbb{R}$. Suppose further that $\eta$ has first moment. Then the kernel $g\colon\mathbb{R}\to\mathbb{R}^{n\times n}$ characterized in (5.1) induces an $n\times n$ finite Lebesgue–Stieltjes measure, which is given by

$$g(dt) = I_n\delta_0(dt) + g*\eta(t)\,dt. \tag{5.9}$$

A function $f = [f_{jk}]\colon\mathbb{R}\to\mathbb{C}^{m\times n}$ is in $L^1(g(dt))$ if

$$\int_{\mathbb{R}}\bigl|f_{jl}(t)[g*\eta]_{lk}(t)\bigr|\,dt<\infty, \quad l=1,\dots,n,$$

for $j=1,\dots,m$ and $k=1,\dots,n$. Moreover, the measure $g(dt)$ has $(p-1)$th moment whenever $\eta$ has $p$th moment for any $p\in\mathbb{N}$.

Proof.

The fact that

induces a Lebesgue–Stieltjes measure of the form

(5.9)

is an

immediate consequence of

(5.2)

. For a measurable function

= [

]

: R → C

m×n

to be

integrable with respect to

(

) = [

(

)] we require that

∈ L

(

)),

= 1

,... , n

for each choice of

= 1

,... , m

and

= 1

,... , n

. Since the variation measure

(

) of

(dt) is given by

|(dt) = δ

({l −k})δ

(dt) +



[g ∗η(t)]



dt,

we see that this condition is equivalent to the statement in the result. Finally, suppose

that η has pth moment for some p ∈ N. Then, for any j,k ∈ {1,...,n}, we get that

|t|

p−1

|(dt) ≤

l=1



|η

|([0,∞))

p−1

(t)| dt

[0,∞)

|t|

p−1

|η

|(dt)

(t)| dt



From the assumptions on

and Proposition 5.1(ii) we get immediately that

|η

([0

,∞

)),

[0,∞)

|t|

p−1

|η

(

) and

(

)

| dt

are ﬁnite for all

= 1

,... , n

. Moreover, for any such

l we compute that

p−1

(t)| dt

≤

|t|≤1

p−1

(t)| dt +



|t|>1

−2





|t|>1

(t))



which is ﬁnite, since (

t 7→ t

(

))

∈ L

according to the proof of Proposition 5.1(ii),

and hence we have shown the last part of the result. 

We now give a result that both will be used to prove the uniqueness part of Theo-

rem 3.1 and Theorem 3.2.


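Lemma 5.3. Suppose that $\det h(iy)\neq 0$ for all $y\in\mathbb{R}$ and that $\eta$ is a finite measure with second moment, and let $g$ be given by (3.2). Furthermore, let $(X_t)_{t\in\mathbb{R}}$ be a measurable process, which is bounded in $L^1(\mathbb{P})$ and satisfies (1.5) almost surely for all $s<t$. Then, for each $s\in\mathbb{R}$ and almost surely,

$$X_t = g(t-s)X_s + \int_s^\infty g(t-u)\,\eta*\{1_{(-\infty,s]}X\}(u)\,du + g*\{1_{(s,\infty)}(Z-Z_s)\}(t) \tag{5.10}$$

for Lebesgue almost all $t>s$, using the notation

$$\bigl(\eta*\{1_{(-\infty,s]}X\}\bigr)^j(t) := \sum_{k=1}^n\int_{[0,\infty)}1_{(-\infty,s]}(t-u)X_{t-u}^k\,\eta_{jk}(du)$$

and

$$\bigl(g*\{1_{(s,\infty)}(Z-Z_s)\}\bigr)^j(t) := \sum_{k=1}^n\int_{\mathbb{R}}1_{(s,\infty)}(t-u)\bigl(Z_{t-u}^k - Z_s^k\bigr)\,g_{jk}(du)$$

for $j=1,\dots,n$ and $t\in\mathbb{R}$.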

Proof.

By arguments similar to those in the proof of Proposition 5.1(iii) we get that

the assumption

deth

(

)

0 for all

y ∈ R

implies that we can choose

δ ∈

,ε

), such

that deth(z) , 0 for all z ∈ C with 0 ≤Re(z) < δ and

sup

0<x<δ



h(x + iy)

−1



dy < ∞ for all j,k = 1,...,n.

Thus, a slight extension of [13, Theorem 1 (Section 3.4)] (which can be found in [3,

Lemma 4.1]) ensures that

[

](

) =

(

)

−1

when

(

)

∈

,δ

). From this point we will

ﬁx such z and let s ∈ R be given. Since (X

)

t∈R

satisﬁes (1.4),

(s,∞)

(t)X

= 1

(s,∞)

(t)X

−∞

(s,∞)

(u)η ∗X(u) du + 1

(s,∞)

(t)(Z

−Z

)

for Lebesgue almost all

t ∈ R

outside a

-null set (this is a consequence of Tonelli’s

theorem). In particular, this shows that

zL[1

(s,∞)

X](z)

= z

L[1

(s,∞)

](z) + L



−∞

(s,∞)

(u)η ∗X(u) du



(z)

+ L[1

(s,∞)

(Z −Z

)](z)

= L[X

( · −s)](z) + L[1

(s,∞)

η ∗X](z) + zL[1

(s,∞)

(Z −Z

)](z).

By noticing that

L[1

(s,∞)

η ∗X](z) = L

(s,∞)

η ∗{1

(−∞,s]

(z) + L

η ∗{1

(s,∞)

(z)

= L

(s,∞)

η ∗{1

(−∞,s]

(z) + L[η](z)L[1

(s,∞)

X](z),

it follows that

h(z)L[1

(s,∞)

X](z)

= L

( · −s) + 1

(s,∞)

η ∗{1

(−∞,s]

(z) + zL[1

(s,∞)

(Z −Z

)](z).


(Observe that, since both (

)

t∈R

and (

)

t∈R

are bounded in

(

), the Laplace trans-

forms above are all well-deﬁned almost surely. We refer to the beginning of the proof

of Theorem 3.1, where details for a similar argument are given.) Now, by using that

L[g](z) = h(z)

−1

, we ﬁnd

zh(z)

−1

L[1

(s,∞)

(Z −Z

)](z) = L[g(dt)](z)L[1

(s,∞)

(Z −Z

)](z)

= L

g ∗{1

(s,∞)

(Z −Z

)}

(z)

and, thus,

= g(t −s)X

∞

g(t −u)η ∗{1

(−∞,s]

X}(u) du + g ∗{1

(s,∞)

(Z −Z

)}

for Lebesgue almost all t > s with probability one. 

With Lemma 5.3 in hand we are now ready to prove the general result, Theorem 3.1,

for existence and uniqueness of solutions to the MSDDE (1.4).

Proof of Theorem 3.1.

Fix

t ∈ R

. The convolution in

(3.3)

is well-deﬁned if

u 7→ Z

t−u

-integrable (by Corollary 5.2) which means that

u 7→ Z

t−u

belongs to

(

))

for all

j,k

= 1

,... , n

. Observe that, since (

)

u∈R

is integrable and has stationary

increments, [2, Corollary A.3] implies that there exists

α,β >

0 such that

[

]

≤

α + β|u| for all u ∈ R. Consequently,



t−u

|µ(du)



≤ (α + β|t|)µ(R) + β

|u|µ(du) < ∞

for any (non-negative) measure

which has ﬁrst moment. This shows that

u 7→ Z

t−u

will be integrable with respect to such measure almost surely, in particular with

respect to |g

|(du) for j = 1,...,n (according to Corollary 5.2).

We will now argue that (

)

t∈R

deﬁned by

(3.3)

does indeed satisfy

(1.4)

, and thus

we ﬁx s < t. Due to the fact that

∗η

(u) du =

∗η

(u) du +



g ∗η(r)Z

· −r



∗η

(u) du

it is clear by the deﬁnition of (X

)

t∈R

that it suﬃces to argue that



g ∗η(r)Z

· −r



∗η

(u) du

[g ∗η(t −r) −g ∗η(s −r)]

dr −

∗η

(r) dr.


We do this componentwise, so we ﬁx i ∈ {1, . ..,n} and compute that





g ∗η(r)Z

· −r



∗η

(u) du



j=1

k=1

l=1



∗η

(v)Z

· −r



∗η

(u) du

j=1

k=1

l=1

[0,∞)

(u −v −r −w)η

(dv) du η

(dw) dr

k=1

l=1

[0,∞)

(g ∗η)

(u −r −w) du η

(dw) dr

k=1

l=1



[0,∞)

(t −r −w) −g

(s −r −w)]η

(dw) dr

−

[0,∞)

({i −l})1

(s,t]

(r + w)η

(dw) dr



k=1



[(g ∗η)

(t −r) −(g ∗η)

(s −r)] dr −

∗η

(r) dr





[g ∗η(t −r) −g ∗η(s −r)]

dr −

∗η

(r) dr



where we have used (i) in Proposition 5.1 and the fact that

and

commute in a

convolution sense, g ∗η = (g

∗η

)

(compare the associated Fourier transforms).

Next, we need to argue that (

)

t∈R

is stationary. Here we will use

(5.3)

to write

the solution as

g ∗η(u)[Z

t−u

−Z

] du

for each

t ∈ R

. Fix

m ∈ R

. Let

−m

< t

< ··· < t

be a partition of [

−m,m

] with

max

j=1,...,k

−t

j−1

) → 0, k → ∞, and deﬁne the Riemann sum

m,k

j=1

g ∗η(t

j−1

)[Z

t−t

j−1

−Z

](t

−t

j−1

Observe that (

m,k

)

t∈R

is stationary. Moreover, the

th component of

m,k

converges

to the ith component of

−m

g ∗η(u)[Z

t−u

−Z

] du

in L

(P) as k → ∞. To see this, we start by noting that



]

−[X

m,k

]



≤

j=1

l=1

l−1

]

(u)E



[g ∗η]

(u)[Z

t−u

−Z

]

−[g ∗η]

l−1

)[Z

t−t

l−1

−Z

]



du.


Then, for each j ∈ {1, ...,n},

max

l=1,...,k

l−1

]

(u)E



[g ∗η]

(u)[Z

t−u

−Z

] −[g ∗η]

l−1

)[Z

t−t

l−1

−Z

]



≤ max

l=1,...,k

l−1

]

(u)



|(g ∗η)

(u)|E



t−u

−Z

t−t

l−1



+ E



t−t

l−1

−Z



[g ∗η]

(u) −[g ∗η]

l−1

)





→ 0

k → ∞

for almost all

u ∈ R

using that (

)

t∈R

is continuous in

(

) (cf. [2,

Corollary A.3]) and that [

g ∗η

]

is càdlàg. Consequently, Lebesgue’s theorem on

dominated convergence implies that

m,k

→ X

entrywise in

(

) as

k → ∞

, thus

(

)

t∈R

inherits the stationarity property from (

m,k

)

t∈R

. Finally, since

→ X

(entrywise) almost surely as m → ∞, we obtain that (X

)

t∈R

is stationary as well.

To show the uniqueness part, we let (

)

t∈R

and (

)

t∈R

be two stationary, inte-

grable and measurable solutions to

(1.4)

. Then

B U

−V

t ∈ R

, is bounded in

(

) and satisﬁes an MSDDE without noise. Consequently, Lemma 5.3 implies that

= g(t −s)X

∞

g(t −u)η ∗{1

(−∞,s]

X}(u) du

holds for each s ∈ R and Lebesgue almost all t > s. For a given j we thus ﬁnd that



≤ C

k=1



(t −s)|+

l=1

∞

(t −u)||η

|([u −s,∞)) du



where

C B max

[

]. It follows by Proposition 5.1(ii) that

(

) converges

t → ∞

, and since

g ∈ L

it must be towards zero. Using this fact together with

Lebesgue’s theorem on dominated convergence it follows that the right-hand side of

the expression above converges to zero as

tends to

−∞

, from which we conclude

that

almost surely for Lebesgue almost all

. By continuity of both processes

in L

(P) (cf. [2, Corollary A.3]), we get the same conclusion for all t.

Finally, under the assumption that

deth

(

)

0 for

z ∈ C

with

(

)

≥

0 it follows

from Proposition 5.1(iv) that g ∗η is vanishing on (−∞,0), and hence we get that the

solution (X

)

t∈R

deﬁned by (3.3) is causal since

= Z

∞

g ∗η(u)Z

t−u

du = −

∞

g ∗η(u)[Z

−Z

t−u

] du, t ∈ R,

by (5.3). 

Proof of Theorem 3.2. Since $(X_t)_{t\in\mathbb{R}}$ is a solution to an MSDDE,

$$\sigma(X_u : u\leq s) = \sigma(Z_s - Z_u : u\leq s)$$

and the theorem therefore follows by Lemma 5.3. □

Proof of Proposition 4.1.

We start by arguing why

(4.2)

is well-deﬁned. To see that

this is the case, note initially that

(

t − ·

)) =

−Z

t−r

and thus, since (

)

t∈R

integrable and has stationary increments,

[

(

t − ·

))

]

≤ α

β|r|

for all

r ∈ R

and

suitably chosen α,β > 0 (see, e.g., [2, Corollary A.3]). In particular,



(t − · ))||µ|(dr)



≤ α|µ|(R) + β

|r||µ|(dr) < ∞,


and thus

(

t − ·

)) is integrable with respect to

and the right-hand side of

(4.2)

is well-deﬁned almost surely for each

t ∈ R

. To show that the left-hand side is well-

deﬁned, it suﬃces to note that

u 7→

(

)

(

) belongs to

∩L

by an application

of Jensen’s inequality and Tonelli’s theorem.

To show

(4.3)

, ﬁx

t ∈ R

and

j,k ∈ {

,... , n}

, and note that

(

) = [

g ∗η

]

(

)

is a

ﬁnite measure with having ﬁrst moment according to Corollary 5.2. Consequently,

we can use assumptions (i)–(ii) on I

to get

[g ∗η]

(r)

t−r

−Z

dr =

(t,t−r]

)[g ∗η]

(r) dr

= I



(t,t−r]

[g ∗η]

(r) dr



= I



({j −k})1

[0,∞)

(t − · ) +

t− ·

−∞

[g ∗η]

(u) du



= I

(t − · ))

using

(5.2)

and the convention that

(a,b]

−1

(b,a]

when

a > b

. By combining this

relation with (5.3) and (3.3) we obtain

k=1

[g ∗η]

(r)

t−r

−Z

dr =

k=1

(t − · )),

which was to be shown. 

Proof of Proposition 4.3.

Let

α ∈

2] and

β ∈

−

/α

), and consider a function

f : R → R in L

∩L

. We start by writing

∞

|f (u)|(u −t)

β−1

du =

|f (t + u)|u

β−1

du +

∞

|f (t + u)|u

β−1

du.

For the left term we ﬁnd that



|f (t + u)|u

β−1



dt ≤



β−1



α−1

|f (t + u)|

β−1

du dt



β−1



|f (t)|

dt < ∞.

For the right term we ﬁnd



∞

|f (t + u)|u

β−1



dt ≤



f (u) du



α−1

∞

|f (t + u)|u

α(β−1)

du dt



f (u) du



∞

α(β−1)

du < ∞.

We conclude that (I

−

f )(u) ∈ L

. 

Proof of Theorem 4.5. The identity (4.9) is just a matter of applying standard computation rules for determinants. For instance, one may prove the result when $z\neq 0$ by induction using the block representation

$$h(z) = \begin{bmatrix}A & B\\ C & D\end{bmatrix} \tag{5.11}$$

with $A = I_nz$, $B = -(e_1\otimes I_n)^\top\in\mathbb{R}^{n\times(m-1)n}$, $C = -e_{m-1}\otimes\mathcal{L}[\varpi_0](z)\in\mathbb{C}^{(m-1)n\times n}$ and

$$D = \begin{bmatrix}
I_nz & -I_n & 0 & \cdots & 0\\
0 & I_nz & -I_n & \cdots & 0\\
\vdots & \vdots & \ddots & \ddots & \vdots\\
0 & 0 & \cdots & I_nz & -I_n\\
-\mathcal{L}[\varpi_1](z) & -\mathcal{L}[\varpi_2](z) & \cdots & -\mathcal{L}[\varpi_{m-2}](z) & I_nz - \mathcal{L}[\varpi_{m-1}](z)
\end{bmatrix}.$$

Here $e_1$ and $e_{m-1}$ refer to the first and last canonical basis vector of $\mathbb{R}^{m-1}$, respectively. The case where $z=0$ follows directly from the Leibniz formula. In case $\det h(iy)\neq 0$ for all $y\in\mathbb{R}$, we may write $h(iy)^{-1}$ as an $m\times m$ matrix, where each element $[h(iy)^{-1}]_{jk}$ is an $n\times n$ matrix. Then we know from Theorem 3.1 that the unique solution to (4.6) is a $(Z_t)_{t\in\mathbb{R}}$-driven moving average of the form (4.8) with $\mathcal{F}[g_{1m}](y) = [h(iy)^{-1}]_{1m}$. Similar to the computation of $\det h(z)$, when $h(z)$ is invertible, block $(1,m)$ of $h(z)^{-1}$ can inductively be shown to coincide with

$$\Bigl(I_nz^m - \sum_{j=0}^{m-1}\mathcal{L}[\varpi_j](z)z^j\Bigr)^{-1}$$

using the representation (5.11) and standard rules for inverting block matrices. This means in particular that (4.10) is true. □

Proof of Theorem 4.8.

We start by arguing that there exists an integrable function

, which is vanishing on (

−∞,

0) and has Fourier transform given by

(4.19)

. Note that,

since

z 7→ detQ

(

) is just a polynomial (of order

), the assumption that

detQ

(

)

whenever Re(z) ≥ 0 implies in fact that

H(z) B R(z) −Q(z)

−1

P (z) = Q(z)

−1

[Q(z)R(z) −P (z)]

is well-deﬁned for all

z ∈ S

B {x

x ≥ −δ, y ∈ R}

for a suitably chosen

δ >

0. By a

slight modiﬁcation of [13, Theorem 1 (Section 3.4)], or by [3, Lemma 4.1], it suﬃces

to argue that there exists ε ∈ (0,δ] such that

sup

x>−ε

|H(x + iy)

dy < ∞ for all j,k = 1,...,n. (5.12)

Let k · k denote any sub-multiplicative norm on C

n×n

and note that

|H(z)

| ≤ kQ(z)

−1

kkQ(z)R(z) −P (z)k.

Thus, since

(

)

(

)

− P

(

)

k ∼ c

|z|

q−1

and

(

)

−1

k ∼ c

|z|

−q

|z| → ∞

for some

≥

1 (the former by the choice of

and the latter by Cramer’s rule),

(

)

(

|z|

−1

). Consequently, the continuity of

ensures that

(5.12)

is satisﬁed for a suit-

able

ε ∈

,δ

], and we have established the existence of

with the desired Fourier

transform. This also establishes that the $n\times n$ measures $\varpi_0,\dots,\varpi_{p-q-1}$ defined as in the statement of the theorem are finite and have moments of any order. Associate to these measures the $n(p-q)\times n(p-q)$ measure $\eta$ given in (4.7). Then it follows from (4.9) that

$$\det h(iy) = \det\Bigl(I_n(iy)^{p-q} + \sum_{j=0}^{p-q-1}C_j(iy)^j - \mathcal{F}[f](y)\Bigr) = \frac{\det P(iy)}{\det Q(iy)}$$

and hence $\det h(iy)$ is non-zero for all $y\in\mathbb{R}$. In light of Theorem 4.5, in particular (4.10), we may therefore conclude that the unique solution to (4.6) is a $(Z_t)_{t\in\mathbb{R}}$-driven moving average, where the driving kernel has Fourier transform

$$\Bigl(I_n(iy)^{p-q} + \sum_{j=0}^{p-q-1}C_j(iy)^j - \mathcal{F}[f](y)\Bigr)^{-1} = P(iy)^{-1}Q(iy), \quad y\in\mathbb{R}.$$

In other words, the unique solution is the $(Z_t)_{t\in\mathbb{R}}$-driven MCARMA process associated to the polynomials $P$ and $Q$. □

Before giving the proof of Corollary 4.11 we will need the following lemma:

Lemma 5.4. Let C

,... , C

p−q−1

be given in (4.17) and C

p−q

= I

. Deﬁne

(z) =

p−q

k=j

k−j

, j = 1,..., p −q −1.

Then

p − q −

2 times diﬀerentiable and

p−q−2

has a density with respect to the

Lebesgue measure which we denote D

p−q−1

g. Furthermore, we have that

p−q

⊗I

)

g = [

(D),. ..,

p−q−1

(D),

g] (5.13)

where

(D)(t) =

p−q

k=j

k−j

g(t)C

= 1

[0,∞)

(t)(e

⊗I

)

p−q

k=j

k−j

(5.14)

for

= 1

,... , p − q −

1 and

g : R → R

n×n

is characterized by

[

](

) =

(

)

−1

with

h: C → C

n(p−q)×n(p−q)

given by

h(z) =







z −I

0 ··· 0

0 I

z −I

··· 0

0 0 ··· I

z −I

−1

(z)P (z) −zR

(z) C

··· C

p−q−2

z + C

p−q−1







Proof.

The fact that

p −q −

2 times diﬀerentiable and

p−q−2

has a density with

respect to the Lebesgue measure follows from the relation in

(5.2)

. Furthermore, by

Theorem 4.8 we know that

[

](

) =

(

)

−1

(

). Consequently,

(5.13)

follows since

[P (iy)

−1

Q(iy)R

(iy),

..., P (iy)

−1

Q(iy)R

p−q−1

(iy),P (iy)

−1

Q(iy)]h(z) = (e

p−q

⊗I

)

The relation in (5.14) is due to the representation of

g given in (4.16). 

Proof of Corollary 4.11.

The prediction formula is a consequence of Lemma 5.4

combined with Theorems 3.2 and 4.8. Furthermore, to get the expression for

g ∗

{

(s,∞)

}, note that

g(dv) = 1

{p=q+1}

(dv) + (e

⊗I

)

AE dv,

which follows from the representation of

g in (4.16). 

Acknowledgments

This work was supported by the Danish Council for Independent Research (grant

DFF–4002–00003).

References

[1] Barndorff-Nielsen, O.E., J.L. Jensen and M. Sørensen (1998). Some stationary processes in discrete and continuous time. Adv. in Appl. Probab. 30(4), 989–1007. doi: 10.1239/aap/1035228204.

[2] Barndorff-Nielsen, O.E. and A. Basse-O’Connor (2011). Quasi Ornstein–Uhlenbeck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[3] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2019). Stochastic delay differential equations and related autoregressive models. Stochastics. Forthcoming. doi: 10.1080/17442508.2019.1635601.

[4] Benth, F.E., J. Šaltytė-Benth and S. Koekebakker (2007). Putting a price on temperature. Scand. J. Statist. 34(4), 746–767. doi: 10.1111/j.1467-9469.2007.00564.x.

[5] Box, G.E.P. and G.M. Jenkins (1970). Time series analysis. Forecasting and control. Holden-Day, San Francisco, Calif.-London-Amsterdam.

[6] Brockwell, P.J. (2001). Lévy-driven CARMA processes. Ann. Inst. Statist. Math. 53(1). Nonlinear non-Gaussian models and related filtering methods (Tokyo, 2000), 113–124. doi: 10.1023/A:1017972605872.

[7] Brockwell, P.J. (2014). Recent results in the theory and applications of CARMA processes. Ann. Inst. Statist. Math. 66(4), 647–685. doi: 10.1007/s10463-014-0468-7.

[8] Brockwell, P.J., R.A. Davis and Y. Yang (2011). Estimation for non-negative Lévy-driven CARMA processes. J. Bus. Econom. Statist. 29(2), 250–259. doi: 10.1198/jbes.2010.08165.

[9] Brockwell, P.J. and A. Lindner (2015). Prediction of Lévy-driven CARMA processes. J. Econometrics 189(2), 263–271.

[10] Brockwell, P.J. and T. Marquardt (2005). Lévy-driven and fractionally integrated ARMA processes with continuous time parameter. Statist. Sinica 15(2), 477–494.

[11] Brockwell, P.J. and E. Schlemm (2013). Parametric estimation of the driving Lévy process of multivariate CARMA processes from discrete observations. J. Multivariate Anal. 115, 217–251. doi: 10.1016/j.jmva.2012.09.004.

[12] Doetsch, G. (1937). Bedingungen für die Darstellbarkeit einer Funktion als Laplace-integral und eine Umkehrformel für die Laplace-Transformation. Math. Z. 42(1), 263–286. doi: 10.1007/BF01160078.

[13] Dym, H. and H.P. McKean (1976). Gaussian processes, function theory, and the inverse spectral problem. Probability and Mathematical Statistics, Vol. 31. New York: Academic Press [Harcourt Brace Jovanovich Publishers].

[14] García, I., C. Klüppelberg and G. Müller (2011). Estimation of stable CARMA models with an application to electricity spot prices. Stat. Model. 11(5), 447–470. doi: 10.1177/1471082X1001100504.

[15] Gripenberg, G. and I. Norros (1996). On the prediction of fractional Brownian motion. J. Appl. Probab. 33(2), 400–410.

[16] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[17] Marquardt, T. (2006). Fractional Lévy processes with an application to long memory moving average processes. Bernoulli 12(6), 1099–1126.

[18] Marquardt, T. (2007). Multivariate fractionally integrated CARMA processes. J. Multivariate Anal. 98(9), 1705–1725.

[19] Marquardt, T. and R. Stelzer (2007). Multivariate CARMA processes. Stochastic Process. Appl. 117(1), 96–120. doi: 10.1016/j.spa.2006.05.014.

[20] Mohammed, S.E.A. and M.K.R. Scheutzow (1990). Lyapunov exponents and stationary solutions for affine stochastic delay equations. Stochastics Stochastics Rep. 29(2), 259–283.

[21] Nielsen, M.S. and V.U. Rohde (2017). Recovering the background noise of a Lévy-driven CARMA process using an SDDE approach. Proceedings ITISE 2017 2, 707–718.

[22] Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[23] Samorodnitsky, G. and M.S. Taqqu (1994). Stable Non-Gaussian Random Processes. Stochastic Modeling. Stochastic models with infinite variance. New York: Chapman & Hall.

[24] Sato, K., T. Watanabe and M. Yamazato (1994). Recurrence conditions for multidimensional processes of Ornstein–Uhlenbeck type. J. Math. Soc. Japan 46(2), 245–265.

[25] Sato, K. and M. Yamazato (1983). “Stationary processes of Ornstein–Uhlenbeck type”. Probability theory and mathematical statistics (Tbilisi, 1982). Vol. 1021. Lecture Notes in Math. Springer, Berlin, 541–551. doi: 10.1007/BFb0072949.

[26] Stelzer, R. (2011). CARMA Processes driven by Non-Gaussian Noise. arXiv: 1201.0155.

[27] Todorov, V. (2009). Estimation of continuous-time stochastic volatility models with jumps using high-frequency data. J. Econometrics 148(2), 131–148.

Paper E

Stochastic Diﬀerential Equations with a

Fractionally Filtered Delay: A Semimartingale

Model for Long-Range Dependent Processes

Richard A. Davis, Mikkel Slot Nielsen and Victor Rohde

Abstract

In this paper we introduce a model, the stochastic fractional delay diﬀerential equa-

tion (SFDDE), which is based on the linear stochastic delay diﬀerential equation

and produces stationary processes with hyperbolically decaying autocovariance

functions. The model departs from the usual way of incorporating this type of

long-range dependence into a short-memory model as it is obtained by applying

a fractional ﬁlter to the drift term rather than to the noise term. The advantages

of this approach are that the corresponding long-range dependent solutions are

semimartingales and the local behavior of the sample paths is unaﬀected by the

degree of long memory. We prove existence and uniqueness of solutions to the

SFDDEs and study their spectral densities and autocovariance functions. More-

over, we deﬁne a subclass of SFDDEs which we study in detail and relate to the

well-known fractionally integrated CARMA processes. Finally, we consider the

task of simulating from the deﬁning SFDDEs.

MSC: 60G22; 60H10; 60H20; 60G17; 60H05

Keywords: Long-range dependence; Moving average processes; Semimartingales; Stochastic

diﬀerential equations

1 Introduction

Models for time series producing slowly decaying autocorrelation functions (ACFs)

have been of interest for more than 50 years. Such models were motivated by the

empirical ﬁndings of Hurst in the 1950s that were related to the levels of the Nile

River. Later, in the 1960s, Benoit Mandelbrot referred to a slowly decaying ACF as

the Joseph effect or long-range dependence. Since then, a vast amount of literature on theoretical results and applications has been developed. We refer to [6, 12, 25, 28, 29] and references therein for further background.

A very popular discrete-time model for long-range dependence is the autoregressive fractionally integrated moving average (ARFIMA) process, introduced by Granger and Joyeux [14] and Hosking [18], which extends the ARMA process to allow for a hyperbolically decaying ACF. Let $B$ be the backward shift operator and, for $\gamma > -1$, define $(1-B)^{\gamma}$ by means of the binomial expansion

$$(1-B)^{\gamma} = \sum_{j=0}^{\infty} \pi_j B^j,$$

where $\pi_j = \prod_{0<k\le j} \frac{k-1-\gamma}{k}$. An ARFIMA process $(X_t)_{t\in\mathbb{Z}}$ is characterized as the unique purely non-deterministic process (as defined in [8, p. 189]) satisfying

$$P(B)(1-B)^{\beta} X_t = Q(B)\varepsilon_t, \quad t\in\mathbb{Z}, \tag{1.1}$$

where $P$ and $Q$ are real polynomials with no zeroes on $\{z\in\mathbb{C} : |z|\le 1\}$, $(\varepsilon_t)_{t\in\mathbb{Z}}$ is an i.i.d. sequence with $E[\varepsilon_0]=0$, $E[\varepsilon_0^2]\in(0,\infty)$ and $\beta\in(0,1/2)$. The ARFIMA equation (1.1) is sometimes represented as an ARMA equation with a fractionally integrated noise, that is,

$$P(B)X_t = Q(B)(1-B)^{-\beta}\varepsilon_t, \quad t\in\mathbb{Z}. \tag{1.2}$$

In (1.1) one applies a fractional filter to $(X_t)_{t\in\mathbb{Z}}$, while in (1.2) one applies a fractional filter to $(\varepsilon_t)_{t\in\mathbb{Z}}$. One main feature of the solution to (1.1), equivalently (1.2), is that the autocovariance function $\gamma_X(t) := E[X_0X_t]$ satisfies

$$\gamma_X(t) \sim c\,t^{2\beta-1}, \quad t\to\infty, \tag{1.3}$$

for some constant $c > 0$.
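To make the binomial expansion concrete, the following minimal sketch computes the first coefficients $\pi_j$ of $(1-B)^{\gamma}$ from the product formula via the recursion $\pi_j = \pi_{j-1}(j-1-\gamma)/j$. The function name `frac_diff_weights` is an illustrative choice and does not come from the paper.

```python
def frac_diff_weights(gamma, n_terms):
    """Return [pi_0, ..., pi_{n_terms-1}] with pi_j = prod_{0<k<=j} (k - 1 - gamma) / k."""
    weights = [1.0]  # the empty product gives pi_0 = 1
    for k in range(1, n_terms):
        # each new weight multiplies the previous one by (k - 1 - gamma) / k
        weights.append(weights[-1] * (k - 1 - gamma) / k)
    return weights


# gamma = 1 recovers ordinary differencing 1 - B: the weights vanish after lag 1
print(frac_diff_weights(1.0, 4))

# a fractional exponent yields weights that never vanish and decay slowly
print(frac_diff_weights(0.3, 6))
```

For a non-integer exponent the weights never terminate, which is the mechanism behind the hyperbolic decay in (1.3).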

A simple example of a continuous-time stationary process which exhibits long memory in the sense of (1.3) is an Ornstein–Uhlenbeck process $(X_t)_{t\in\mathbb{R}}$ driven by a fractional Lévy process, that is, $(X_t)_{t\in\mathbb{R}}$ is the unique stationary solution to

$$dX_t = -\kappa X_t\,dt + dI^{\beta}L_t, \quad t\in\mathbb{R}, \tag{1.4}$$

where $\kappa > 0$ and

$$I^{\beta}L_t := \frac{1}{\Gamma(1+\beta)}\int_{-\infty}^{t}\big[(t-u)^{\beta} - (-u)_+^{\beta}\big]\,dL_u, \quad t\in\mathbb{R}, \tag{1.5}$$

with $(L_t)_{t\in\mathbb{R}}$ being a Lévy process which satisfies $E[L_1]=0$ and $E[L_1^2]\in(0,\infty)$. In (1.5), $\Gamma$ denotes the gamma function and we have used the notation $x_+ = \max\{x,0\}$ for $x\in\mathbb{R}$. The way to obtain long memory in (1.4) is by applying a fractional filter to the noise, which is in line with (1.2). To demonstrate the idea of this paper, consider the equation obtained from (1.4) but by applying a fractional filter to the drift term instead, i.e.,

$$X_t - X_s = -\frac{\kappa}{\Gamma(1-\beta)}\int_{-\infty}^{t}\big[(t-u)^{-\beta} - (s-u)_+^{-\beta}\big]X_u\,du + L_t - L_s, \quad s<t. \tag{1.6}$$

One can write (1.6) compactly as

$$dX_t = -\kappa D^{\beta}X_t\,dt + dL_t, \quad t\in\mathbb{R}, \tag{1.7}$$

with $(D^{\beta}X_t)_{t\in\mathbb{R}}$ being a suitable fractional derivative process of $(X_t)_{t\in\mathbb{R}}$ defined in Proposition 3.6. The equations (1.6)–(1.7) are akin to (1.1). It turns out that a unique purely non-deterministic process (as defined in (3.10)) satisfying (1.7) exists and has the following properties:

(i) The memory is long and controlled by $\beta$ in the sense that $\gamma_X(t) \sim c\,t^{2\beta-1}$ as $t\to\infty$ for some $c>0$.

(ii) The $L^2(P)$ Hölder continuity of the sample paths is not affected by $\beta$ in the sense that $\gamma_X(0) - \gamma_X(t) \sim c\,t$ as $t\downarrow 0$ for some $c>0$ (the notion of Hölder continuity in $L^2(P)$ is indeed closely related to the behavior of the ACF at zero; see Remark 3.9 for a precise relation).

(iii) $(X_t)_{t\in\mathbb{R}}$ is a semimartingale.

While both processes in (1.4) and (1.7) exhibit long memory in the sense of (i), one should keep in mind that models for long-memory processes obtained by applying a fractional filter to the noise will generally not meet (ii)–(iii), since they inherit various properties from the fractional Lévy process $(I^{\beta}L_t)_{t\in\mathbb{R}}$ rather than from the underlying Lévy process $(L_t)_{t\in\mathbb{R}}$. In particular, this observation applies to the fractional Ornstein–Uhlenbeck process (1.4), which is known not to possess the semimartingale property for many choices of $(L_t)_{t\in\mathbb{R}}$, and for which it holds that $\gamma_X(0) - \gamma_X(t) \sim c\,t^{2\beta+1}$ as $t\downarrow 0$ for some $c>0$ (see [21, Theorem 4.7] and [1, Proposition 2.5]). The latter property, the behavior of $\gamma_X$ near 0, implies an increased $L^2(P)$ Hölder continuity relative to (1.7). See Example 4.4 for details about the models (1.4) and (1.7).

The properties (ii)–(iii) may be desirable to retain in many modeling scenarios. For instance, if a stochastic process $(X_t)_{t\in\mathbb{R}}$ is used to model a financial asset, the semimartingale property is necessary to accommodate the No Free Lunch with Vanishing Risk condition according to the (First) Fundamental Theorem of Asset Pricing, see [10, Theorem 7.2]. Moreover, if $(X_t)_{t\in\mathbb{R}}$ is supposed to serve as a “good” integrator, it follows by the Bichteler–Dellacherie Theorem ([7, Theorem 7.6]) that $(X_t)_{t\in\mathbb{R}}$ must be a semimartingale. Also, the papers [4, 5] find evidence that the sample paths of electricity spot prices and intraday volatility of the E-mini S&P500 futures contract are rough, and Jusselin and Rosenbaum [19] show that the no-arbitrage assumption implies that the volatility of the macroscopic price process is rough. These findings suggest less smooth sample paths than what is induced by models such as the fractional Ornstein–Uhlenbeck process (1.4). In particular, the local smoothness of the sample paths should not be connected to the strength of long memory.

Several extensions to the fractional Ornstein–Uhlenbeck process (1.4) exist. For example, it is worth mentioning that the class of fractionally integrated continuous-time autoregressive moving average (FICARMA) processes was introduced in Brockwell and Marquardt [9], where it is assumed that $P$ and $Q$ are real polynomials with $\deg(P) > \deg(Q)$ which have no zeroes on $\{z\in\mathbb{C} : \mathrm{Re}(z)\ge 0\}$. The FICARMA process associated to $P$ and $Q$ is then defined as the moving average process

$$X_t = \int_{-\infty}^{t} g(t-u)\,dI^{\beta}L_u, \quad t\in\mathbb{R}, \tag{1.8}$$

with $g:\mathbb{R}\to\mathbb{R}$ being the $L^2$ function characterized by

$$F[g](y) := \int_{\mathbb{R}} e^{-iyu}g(u)\,du = \frac{Q(iy)}{P(iy)}, \quad y\in\mathbb{R}.$$

In line with (1.2) for the ARFIMA process, a common way of viewing a FICARMA process is that it is obtained by applying a CARMA filter to fractional noise, that is, $(X_t)_{t\in\mathbb{R}}$ given by (1.8) is the solution to the formal equation

$$P(D)X_t = Q(D)DI^{\beta}L_t, \quad t\in\mathbb{R}.$$

(See, e.g., [21].) Another class, related to the FICARMA process, consists of solutions $(X_t)_{t\in\mathbb{R}}$ to fractional stochastic delay differential equations (SDDEs), that is, $(X_t)_{t\in\mathbb{R}}$ is the unique stationary solution to

$$dX_t = \int_{[0,\infty)} X_{t-u}\,\eta(du)\,dt + dI^{\beta}L_t, \quad t\in\mathbb{R}, \tag{1.9}$$

for a suitable finite signed measure $\eta$. See [2, 22] for details about fractional SDDEs. Note that the fractional Ornstein–Uhlenbeck process (1.4) is a FICARMA process with polynomials $P(z) = z+\kappa$ and $Q(z) = 1$ and a fractional SDDE with $\eta = -\kappa\delta_0$, $\delta_0$ being the Dirac measure at zero.

The model we present includes (1.6) and extends this process in the same way as the fractional SDDE (1.9) extends the fractional Ornstein–Uhlenbeck process (1.4). Specifically, we will be interested in a stationary process $(X_t)_{t\in\mathbb{R}}$ satisfying

$$X_t - X_s = \int_{-\infty}^{t}\big(D^{\beta}_{-}1_{(s,t]}\big)(u)\int_{[0,\infty)} X_{u-v}\,\eta(dv)\,du + L_t - L_s \tag{1.10}$$

almost surely for each $s<t$, where $\eta$ is a given finite signed measure and

$$\big(D^{\beta}_{-}1_{(s,t]}\big)(u) = \frac{1}{\Gamma(1-\beta)}\big[(t-u)_+^{-\beta} - (s-u)_+^{-\beta}\big], \quad u\in\mathbb{R}.$$

We will refer to (1.10) as a stochastic fractional delay differential equation (SFDDE). Equation (1.10) can be compactly written as

$$dX_t = \int_{[0,\infty)} D^{\beta}X_{t-u}\,\eta(du)\,dt + dL_t, \quad t\in\mathbb{R}, \tag{1.11}$$

with $(D^{\beta}X_t)_{t\in\mathbb{R}}$ defined in Proposition 3.6. The representation (1.11) is, for instance, convenient in order to argue that solutions are semimartingales.

In Section 3 we show that, for a wide range of measures $\eta$, there exists a unique purely non-deterministic process $(X_t)_{t\in\mathbb{R}}$ satisfying the SFDDE (1.10). In addition, we study the behavior of the autocovariance function and the spectral density of $(X_t)_{t\in\mathbb{R}}$ and verify that (i)–(ii) hold. We end Section 3 by providing an explicit (prediction) formula for computing $E[X_t \mid X_u,\ u\le s]$. In Section 4 we focus on delay measures $\eta$ of exponential type, that is,

$$\eta(dt) = -\kappa\delta_0(dt) + f(t)\,dt, \tag{1.12}$$

where $f(t) = 1_{[0,\infty)}(t)\,b^{\top}e^{At}e_1$ with $e_1 = [1,0,\dots,0]^{\top}\in\mathbb{R}^n$, $b\in\mathbb{R}^n$ and $A$ an $n\times n$ matrix with a spectrum contained in $\{z\in\mathbb{C} : \mathrm{Re}(z)<0\}$. Besides relating this subclass to the FICARMA processes we study two special cases of (1.12) in detail, namely the Ornstein–Uhlenbeck type presented in (1.7) and

$$dX_t = \int_0^{\infty} D^{\beta}X_{t-u}f(u)\,du\,dt + dL_t, \quad t\in\mathbb{R}. \tag{1.13}$$

Equation (1.13) is interesting to study as it collapses to an ordinary SDDE (cf. Proposition 4.2), and hence constitutes an example of a long-range dependent solution to equation (1.9) with $I^{\beta}L_t - I^{\beta}L_s$ replaced by $L_t - L_s$. While (1.13) falls into the overall setup of [3], the results obtained in that paper do not apply. Finally, based on the two examples (1.6) and (1.13), we investigate some numerical aspects in Section 5, including the task of simulating $(X_t)_{t\in\mathbb{R}}$ from the defining equation. Section 6 contains the proofs of all the results presented in Sections 3 and 4. We start with a preliminary section which recalls a few definitions and results that will be used repeatedly.

2 Preliminaries

For a measure $\mu$ on the Borel $\sigma$-field $\mathcal{B}(\mathbb{R})$ on $\mathbb{R}$, let $L^p(\mu)$ denote the $L^p$ space relative to $\mu$. If $\mu$ is the Lebesgue measure we suppress the dependence on $\mu$ and write $L^p$ instead of $L^p(\mu)$. By a finite signed measure we refer to a set function $\mu:\mathcal{B}(\mathbb{R})\to\mathbb{R}$ of the form $\mu = \mu^+ - \mu^-$, where $\mu^+$ and $\mu^-$ are two finite singular measures. Integration of a function $f$ with respect to $\mu$ is defined (in an obvious way) whenever $f\in L^1(|\mu|)$, where $|\mu| := \mu^+ + \mu^-$. The convolution of two measurable functions $f,g:\mathbb{R}\to\mathbb{C}$ is defined as

$$f*g(t) = \int_{\mathbb{R}} f(t-u)g(u)\,du$$

whenever $f(t-\cdot)g\in L^1$. Similarly, if $\mu$ is a finite signed measure, we set

$$f*\mu(t) = \int_{\mathbb{R}} f(t-u)\,\mu(du)$$

if $f(t-\cdot)\in L^1(|\mu|)$. For such $\mu$, set

$$D(\mu) = \Big\{z\in\mathbb{C} : \int_{\mathbb{R}} e^{-\mathrm{Re}(z)u}\,|\mu|(du) < \infty\Big\}.$$

Then we define the bilateral Laplace transform $L[\mu]: D(\mu)\to\mathbb{C}$ of $\mu$ by

$$L[\mu](z) = \int_{\mathbb{R}} e^{-zu}\,\mu(du), \quad z\in D(\mu),$$

and the Fourier transform by $F[\mu](y) = L[\mu](iy)$ for $y\in\mathbb{R}$. If $f\in L^1$ we will write $L[f] = L[f(u)\,du]$ and $F[f] = F[f(u)\,du]$. We also note that $F[f]\in L^2$ when $f\in L^1\cap L^2$ and that $F$ can be extended to an isometric isomorphism from $L^2$ onto $L^2$ by Plancherel’s theorem.

Recall that a Lévy process is the continuous-time analogue to the (discrete-time) random walk. More precisely, a one-sided Lévy process $(L_t)_{t\ge 0}$, $L_0 = 0$, is a stochastic process having stationary independent increments and càdlàg sample paths. From these properties it follows that the distribution of $L_1$ is infinitely divisible, and the distribution of $(L_t)_{t\ge 0}$ is determined from $L_1$ via the relation $E[e^{iyL_t}] = \exp\{t\log E[e^{iyL_1}]\}$ for $y\in\mathbb{R}$ and $t\ge 0$. The definition is extended to a two-sided Lévy process $(L_t)_{t\in\mathbb{R}}$ by taking a one-sided Lévy process $(L^1_t)_{t\ge 0}$ together with an independent copy $(L^2_t)_{t\ge 0}$ and setting $L_t = L^1_t$ if $t\ge 0$ and $L_t = -L^2_{(-t)-}$ if $t<0$. If $E[L_1^2]<\infty$, $E[L_1] = 0$ and $f\in L^2$, the integral $\int_{\mathbb{R}} f(u)\,dL_u$ is well-defined as an $L^2(P)$ limit of integrals of step functions, and the following isometry property holds:

$$E\Big[\Big(\int_{\mathbb{R}} f(u)\,dL_u\Big)^2\Big] = E[L_1^2]\int_{\mathbb{R}} f(u)^2\,du.$$

For more on Lévy processes and integrals with respect to these, see [26, 31]. Finally, for two functions $f,g:\mathbb{R}\to\mathbb{C}$ and $a\in[-\infty,\infty]$ we write $f(t) = o(g(t))$, $f(t) = O(g(t))$ and $f(t)\sim g(t)$ as $t\to a$ if

$$\lim_{t\to a}\frac{f(t)}{g(t)} = 0, \quad \limsup_{t\to a}\Big|\frac{f(t)}{g(t)}\Big| < \infty \quad\text{and}\quad \lim_{t\to a}\frac{f(t)}{g(t)} = 1,$$

respectively.

3 The stochastic fractional delay diﬀerential equation

Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with $E[L_1^2]<\infty$ and $E[L_1] = 0$, and let $\beta\in(0,1/2)$. Without loss of generality we will assume that $E[L_1^2] = 1$. Moreover, denote by $\eta$ a finite (possibly signed) measure on $[0,\infty)$ with

$$\int_{[0,\infty)} t\,|\eta|(dt) < \infty \tag{3.1}$$

and set

$$\big(D^{\beta}_{-}1_{(s,t]}\big)(u) = \frac{1}{\Gamma(1-\beta)}\big[(t-u)_+^{-\beta} - (s-u)_+^{-\beta}\big], \quad u\in\mathbb{R}. \tag{3.2}$$

(In line with [12] we write $D^{\beta}_{-}1_{(s,t]}$ rather than $D^{\beta}1_{(s,t]}$ in (3.2) to emphasize that it is the right-sided version of the Riemann–Liouville fractional derivative of $1_{(s,t]}$.) Then we will say that a process $(X_t)_{t\in\mathbb{R}}$ with $E[|X_0|]<\infty$ is a solution to the corresponding SFDDE if it is stationary and satisfies

$$X_t - X_s = \int_{-\infty}^{t}\big(D^{\beta}_{-}1_{(s,t]}\big)(u)\int_{[0,\infty)} X_{u-v}\,\eta(dv)\,du + L_t - L_s \tag{3.3}$$

almost surely for each $s<t$. Note that equation (3.3) is indeed well-defined, since $\eta$ is finite, $(X_t)_{t\in\mathbb{R}}$ is bounded in $L^1(P)$ and $D^{\beta}_{-}1_{(s,t]}\in L^1$. As noted in the introduction, we will often write (3.3) shortly as

$$dX_t = \int_{[0,\infty)} D^{\beta}X_{t-u}\,\eta(du)\,dt + dL_t, \quad t\in\mathbb{R}, \tag{3.4}$$

where $(D^{\beta}X_t)_{t\in\mathbb{R}}$ is a suitable fractional derivative of $(X_t)_{t\in\mathbb{R}}$ (defined in Proposition 3.6).

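In order to study which choices of $\eta$ lead to a stationary solution to (3.3) we introduce the function $h = h_{\beta,\eta}: \{z\in\mathbb{C} : \mathrm{Re}(z)\ge 0\}\to\mathbb{C}$ given by

$$h(z) = z^{1-\beta} - \int_{[0,\infty)} e^{-zu}\,\eta(du), \quad \mathrm{Re}(z)\ge 0. \tag{3.5}$$

Here, and in the following, we define $z^{\gamma} = r^{\gamma}e^{i\gamma\theta}$ using the polar representation $z = re^{i\theta}$ for $r>0$ and $\theta\in(-\pi,\pi]$. This definition corresponds to $z^{\gamma} = e^{\gamma\log z}$, using the principal branch of the complex logarithm, and hence $z\mapsto z^{\gamma}$ is analytic on $\mathbb{C}\setminus\{z\in\mathbb{R} : z\le 0\}$. In particular, this means that $h$ is analytic on $\{z\in\mathbb{C} : \mathrm{Re}(z)>0\}$.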
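The polar-form definition of $z^{\gamma}$ and the function $h$ in (3.5) can be evaluated directly. The sketch below does so for the simplest delay measure $\eta = -\kappa\delta_0$, for which (3.5) reduces to $h(z) = z^{1-\beta} + \kappa$. The names `frac_power` and `h_ou` and the parameter values are assumptions made for illustration only.

```python
import cmath
import math


def frac_power(z, gamma):
    """z**gamma via z = r*exp(i*theta) with theta on the principal branch."""
    r, theta = cmath.polar(z)
    return (r ** gamma) * cmath.exp(1j * gamma * theta)


def h_ou(z, beta, kappa):
    """h(z) from (3.5) for the assumed measure eta = -kappa * delta_0."""
    # the integral of exp(-z*u) against -kappa*delta_0 equals -kappa
    return frac_power(z, 1.0 - beta) + kappa


# the square root of i lies on the diagonal of the first quadrant
print(frac_power(1j, 0.5), cmath.exp(1j * math.pi / 4))

# h along the imaginary axis, where the spectral quantities are evaluated
for y in (-1.0, 0.0, 1.0):
    print(y, h_ou(1j * y, beta=0.25, kappa=2.0))
```

In this special case with $\kappa>0$, $z^{1-\beta}$ has non-negative real part whenever $\mathrm{Re}(z)\ge 0$, so $h$ has no zeroes in the closed right half-plane.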

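Proposition 3.1. Suppose that $h(z)$ defined in (3.5) is non-zero for every $z\in\mathbb{C}$ with $\mathrm{Re}(z)\ge 0$. Then there exists a unique $g:\mathbb{R}\to\mathbb{R}$, which belongs to $L^{\gamma}$ for $(1-\beta)^{-1}<\gamma\le 2$ and is vanishing on $(-\infty,0)$, such that

$$F[g](y) = \frac{(iy)^{-\beta}}{h(iy)}, \quad y\in\mathbb{R}. \tag{3.6}$$

Moreover, the following statements hold:

(i) For $t>0$ the Marchaud fractional derivative $D^{\beta}g(t)$ at $t$ of $g$ given by

$$D^{\beta}g(t) = \frac{\beta}{\Gamma(1-\beta)}\lim_{\delta\downarrow 0}\int_{\delta}^{\infty}\frac{g(t) - g(t-u)}{u^{1+\beta}}\,du \tag{3.7}$$

exists, $D^{\beta}g\in L^1\cap L^2$ and $F[D^{\beta}g](y) = 1/h(iy)$ for $y\in\mathbb{R}$.

(ii) The function $g$ is the Riemann–Liouville fractional integral of $D^{\beta}g$, that is,

$$g(t) = \frac{1}{\Gamma(\beta)}\int_0^{t} D^{\beta}g(u)(t-u)^{\beta-1}\,du, \quad t>0.$$

(iii) The function $g$ satisfies

$$g(t) = 1 + \int_0^{t}\big(D^{\beta}g\big)*\eta(u)\,du, \quad t\ge 0, \tag{3.8}$$

and for $v\in\mathbb{R}$ and with $D^{\beta}_{-}1_{(s,t]}$ given in (3.2),

$$g(t-v) - g(s-v) = \int_{-\infty}^{t}\big(D^{\beta}_{-}1_{(s,t]}\big)(u)\,g*\eta(u-v)\,du + 1_{(s,t]}(v). \tag{3.9}$$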

Before formulating our main result, Theorem 3.2, recall that a stationary process $(X_t)_{t\in\mathbb{R}}$ with $E[X_0^2]<\infty$ and $E[X_0] = 0$ is said to be purely non-deterministic if

$$\bigcap_{t\in\mathbb{R}}\overline{\mathrm{sp}}\{X_s : s\le t\} = \{0\}, \tag{3.10}$$

see [1, Section 4]. Here $\overline{\mathrm{sp}}$ denotes the $L^2(P)$-closure of the linear span.

Theorem 3.2. Suppose that $h(z)$ defined in (3.5) is non-zero for every $z\in\mathbb{C}$ with $\mathrm{Re}(z)\ge 0$ and let $g$ be the function introduced in Proposition 3.1. Then the process

$$X_t = \int_{-\infty}^{t} g(t-u)\,dL_u, \quad t\in\mathbb{R}, \tag{3.11}$$

is well-defined, centered and square integrable, and it is the unique purely non-deterministic solution to the SFDDE (3.3).

Remark 3.3. Note that we cannot hope to get a uniqueness result without imposing a condition such as (3.10). For instance, the fact that

$$\int_{-\infty}^{t}\big[(t-u)^{-\beta} - (s-u)_+^{-\beta}\big]\,du = 0$$

shows together with (3.3) that $(X_t + U)_{t\in\mathbb{R}}$ is a solution for any $U\in L^1(P)$ as long as $(X_t)_{t\in\mathbb{R}}$ is a solution. Moreover, uniqueness relative to condition (3.10) is similar to that of discrete-time ARFIMA processes, see [8, Theorem 13.2.1].
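The vanishing integral in Remark 3.3 is an improper integral, and it can help to see how it arises as a limit. Truncating the lower limit at $-M$ and integrating each power in closed form gives $\big[(t+M)^{1-\beta} - (s+M)^{1-\beta}\big]/\Gamma(2-\beta)$ after division by $\Gamma(1-\beta)$, which decays like $(t-s)M^{-\beta}/\Gamma(1-\beta)$. The sketch below evaluates this truncation; the function name and the parameter values are illustrative assumptions.

```python
import math


def truncated_kernel_integral(s, t, beta, M):
    """Integral over [-M, t] of the kernel in (3.2), using the closed-form antiderivatives."""
    a = 1.0 - beta
    base = (s + M) ** a
    # (t + M)**a - (s + M)**a, written with log1p/expm1 to avoid cancellation for large M
    growth = math.expm1(a * math.log1p((t - s) / (s + M)))
    return base * growth / (a * math.gamma(a))


for M in (1e2, 1e4, 1e6, 1e8):
    exact = truncated_kernel_integral(0.0, 1.0, 0.25, M)
    leading = M ** (-0.25) / math.gamma(0.75)  # first-order approximation
    print(M, exact, leading)
```

The slow $M^{-\beta}$ decay shows that the kernel is integrable but only barely so for small $\beta$.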

Remark 3.4. It is possible to generalize (3.3) and Theorem 3.2 to allow for a heavy-tailed distribution of the noise. Specifically, suppose that $(L_t)_{t\in\mathbb{R}}$ is a symmetric $\alpha$-stable Lévy process for some $\alpha\in(1,2)$, that is, $(L_t)_{t\in\mathbb{R}}$ is a Lévy process and

$$E\big[e^{iyL_1}\big] = e^{-\sigma^{\alpha}|y|^{\alpha}}, \quad y\in\mathbb{R},$$

for some $\sigma>0$. To define the process $(X_t)_{t\in\mathbb{R}}$ in (3.11) it is necessary and sufficient that $g\in L^{\alpha}$, which is indeed the case if $\beta\in(0,1-1/\alpha)$ by Proposition 3.1. From this point, using (3.9), we only need a stochastic Fubini result (which can be found in [1, Theorem 3.1]) to verify that (3.3) is satisfied. One will need another notion (and proof) of uniqueness, however, as our approach relies on $L^2$ theory. For more on stable distributions and corresponding definitions and results, we refer to [30].

Remark 3.5. The process (3.11) and other well-known long-memory processes do naturally share parts of their construction. For instance, they are typically viewed as “borderline” stationary solutions to certain equations. To be more concrete, the ARFIMA process can be viewed as an ARMA process, but where the autoregressive polynomial $P$ is replaced by $z\mapsto P(z)(1-z)^{\beta}$. Although an ordinary ARMA process exists if and only if $P$ is non-zero on the unit circle (and, in the positive case, will be a short memory process), the autoregressive function $z\mapsto P(z)(1-z)^{\beta}$ of the ARFIMA model will always have a root at $z = 1$. The analogue to the autoregressive polynomial in the non-fractional SDDE model (that is, (3.3) with $D^{\beta}_{-}1_{(s,t]}$ replaced by $1_{(s,t]}$) is

$$z\longmapsto z - L[\eta](z), \tag{3.12}$$

where the critical region is on the imaginary axis $\{iy : y\in\mathbb{R}\}$ rather than on the unit circle $\{z\in\mathbb{C} : |z| = 1\}$ (see [2]). The SFDDE corresponds to replacing (3.12) by $z\mapsto z - z^{\beta}L[\eta](z)$, which will always have a root at $z = 0$. However, to ensure existence both in the ARFIMA model and in the SFDDE model, assumptions are made such that these roots will be the only ones in the critical region and their order will be $\beta$. For a treatment of ARFIMA processes, we refer to [8, Section 13.2].

The solution $(X_t)_{t\in\mathbb{R}}$ of Theorem 3.2 is causal in the sense that $X_t$ only depends on past increments of the noise $L_t - L_s$, $s\le t$. An inspection of the proof of Theorem 3.2 reveals that one only needs to require that $h(iy)\ne 0$ for all $y\in\mathbb{R}$ for a (possibly non-causal) stationary solution to exist. The difference between the condition that $h(z)$ is non-zero when $\mathrm{Re}(z) = 0$ rather than when $\mathrm{Re}(z)\ge 0$ in terms of causality is similar to that of non-fractional SDDEs (see, e.g., [2]).

The next result shows why one may view (3.3) as (3.4). In particular, it reveals that the corresponding solution $(X_t)_{t\in\mathbb{R}}$ is a semimartingale with respect to (the completion of) its own filtration or equivalently, in light of (3.3) and (3.11), the one generated from the increments of $(L_t)_{t\in\mathbb{R}}$.

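Proposition 3.6. Suppose that $h(z)$ is non-zero for every $z\in\mathbb{C}$ with $\mathrm{Re}(z)\ge 0$ and let $(X_t)_{t\in\mathbb{R}}$ be the solution to (3.3) given in Theorem 3.2. Then, for $t\in\mathbb{R}$, the limit

$$D^{\beta}X_t := \frac{\beta}{\Gamma(1-\beta)}\lim_{\delta\downarrow 0}\int_{\delta}^{\infty}\frac{X_t - X_{t-u}}{u^{1+\beta}}\,du \tag{3.13}$$

exists in $L^2(P)$, $D^{\beta}X_t = \int_{-\infty}^{t} D^{\beta}g(t-u)\,dL_u$, and it holds that

$$\frac{1}{\Gamma(1-\beta)}\int_{-\infty}^{t}\big[(t-u)^{-\beta} - (s-u)_+^{-\beta}\big]\int_{[0,\infty)} X_{u-v}\,\eta(dv)\,du = \int_s^{t}\int_{[0,\infty)} D^{\beta}X_{u-v}\,\eta(dv)\,du \tag{3.14}$$

almost surely for each $s<t$.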

We will now provide some properties of the solution $(X_t)_{t\in\mathbb{R}}$ to (3.3) given in (3.11). Since the autocovariance function $\gamma_X$ takes the form

$$\gamma_X(t) = \int_{\mathbb{R}} g(t+u)g(u)\,du, \quad t\in\mathbb{R}, \tag{3.15}$$

it follows by Plancherel’s theorem that $(X_t)_{t\in\mathbb{R}}$ admits a spectral density $f_X$ which is given by

$$f_X(y) = \frac{1}{2\pi}|F[g](y)|^2 = \frac{1}{2\pi}\frac{1}{|h(iy)|^2}|y|^{-2\beta}, \quad y\in\mathbb{R}. \tag{3.16}$$

(See the appendix for a brief recap of the spectral theory.) The following result concerning $\gamma_X$ and $f_X$ shows that solutions to (3.3) exhibit a long-memory behavior and that the degree of memory can be controlled by $\beta$.

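Proposition 3.7. Suppose that $h(z)$ is non-zero for every $z\in\mathbb{C}$ with $\mathrm{Re}(z)\ge 0$ and let $\gamma_X$ and $f_X$ be the functions introduced in (3.15)–(3.16). Then it holds that

$$\gamma_X(t)\sim\frac{\Gamma(1-2\beta)}{\Gamma(\beta)\Gamma(1-\beta)\,\eta([0,\infty))^2}\,t^{2\beta-1} \quad\text{as } t\to\infty$$

and

$$f_X(y)\sim\frac{1}{2\pi\,\eta([0,\infty))^2}\,|y|^{-2\beta} \quad\text{as } y\to 0.$$

In particular, $\int_{\mathbb{R}}|\gamma_X(t)|\,dt = \infty$.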
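As a numerical illustration of the low-frequency behavior of the spectral density, the following sketch evaluates $f_X(y) = |y|^{-2\beta}/(2\pi|h(iy)|^2)$ for the simplest delay measure $\eta = -\kappa\delta_0$, so that $h(z) = z^{1-\beta}+\kappa$ and $\eta([0,\infty))^2 = \kappa^2$. It assumes the normalization $E[L_1^2] = 1$ and the $1/(2\pi)$ convention for the spectral density; function names and parameter values are hypothetical.

```python
import cmath
import math


def h_ou(z, beta, kappa):
    """h(z) = z**(1 - beta) + kappa, i.e. (3.5) for the assumed measure -kappa * delta_0."""
    r, theta = cmath.polar(z)  # principal branch of the argument
    return (r ** (1.0 - beta)) * cmath.exp(1j * (1.0 - beta) * theta) + kappa


def spectral_density(y, beta, kappa):
    """f_X(y) = |y|**(-2 beta) / (2 pi |h(iy)|**2) for y != 0."""
    return abs(y) ** (-2.0 * beta) / (2.0 * math.pi * abs(h_ou(1j * y, beta, kappa)) ** 2)


def normalized(y, beta, kappa):
    """f_X(y) divided by |y|**(-2 beta) / (2 pi kappa**2); tends to 1 as y -> 0."""
    return spectral_density(y, beta, kappa) * abs(y) ** (2.0 * beta) * 2.0 * math.pi * kappa ** 2


for y in (1.0, 1e-2, 1e-4, 1e-8):
    print(y, normalized(y, beta=0.25, kappa=2.0))
```

The printed ratios approach 1 as $y\to 0$, so the pole of order $2\beta$ at the origin is governed only by the total mass of $\eta$.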

While the behavior of $\gamma_X(t)$ as $t\to\infty$ is controlled by $\beta$, the content of Proposition 3.8 is that the behavior of $\gamma_X(t)$ as $t\to 0$, and thus the $L^2(P)$ Hölder continuity of the sample paths of $(X_t)_{t\in\mathbb{R}}$ (cf. Remark 3.9), is unaffected by $\beta$.

Proposition 3.8. Suppose that $h(z)$ is non-zero for every $z\in\mathbb{C}$ with $\mathrm{Re}(z)\ge 0$, let $(X_t)_{t\in\mathbb{R}}$ be the solution to (3.3) and denote by $\rho_X$

its ACF. Then it holds that 1

−ρ

(

)

∼ h

h ↓

Remark 3.9. Recall that for a given $\gamma>0$, a centered and square integrable process $(X_t)_{t\in\mathbb{R}}$ with stationary increments is said to be locally $\gamma$-Hölder continuous in $L^2(P)$ if there exists a constant $C>0$ such that

$$\frac{E[(X_t - X_0)^2]}{t^{2\gamma}}\le C$$

for all sufficiently small $t>0$. By defining the semi-variogram

$$\gamma_V(t) := \tfrac{1}{2}E[(X_t - X_0)^2], \quad t\in\mathbb{R},$$

we see that $(X_t)_{t\in\mathbb{R}}$ is locally $\gamma$-Hölder continuous if and only if $\gamma_V(t) = O(t^{2\gamma})$ as $t\to 0$. When $(X_t)_{t\in\mathbb{R}}$ is stationary we have the relation $\gamma_V = \gamma_X(0)(1-\rho_X)$, from which it follows that the $L^2(P)$ notion of Hölder continuity can be characterized in terms of the behavior of the ACF at zero. In particular, Proposition 3.8 shows that the solution $(X_t)_{t\in\mathbb{R}}$ to (3.3) is locally $\gamma$-Hölder continuous if and only if $\gamma\le 1/2$. The behavior of the ACF at zero has been used as a measure of roughness of the sample paths in for example [4, 5].

Remark 3.10. As a final comment on the path properties of the solution $(X_t)_{t\in\mathbb{R}}$ to (3.3), observe that

$$X_t - X_s = \int_s^{t}\int_{[0,\infty)} D^{\beta}X_{u-v}\,\eta(dv)\,du + L_t - L_s$$

for each $s<t$ almost surely by Proposition 3.6. This shows that $(X_t)_{t\in\mathbb{R}}$ can be chosen so that it has jumps at the same time (and of the same size) as $(L_t)_{t\in\mathbb{R}}$. This is in contrast to models driven by a fractional Lévy process, such as (1.9), since $(I^{\beta}L_t)_{t\in\mathbb{R}}$ is continuous in $t$ (see [21, Theorem 3.4]).

We end this section by providing a formula for computing $E[X_t \mid X_u,\ u\le s]$ for any $s<t$. One should compare its form to those obtained for other fractional models (such as the one in [3, Theorem 3.2] where, as opposed to Proposition 3.11, the prediction is expressed not only in terms of its own past, but also the past noise).

Proposition 3.11. Suppose that $h(z)$ is non-zero for every $z\in\mathbb{C}$ with $\mathrm{Re}(z)\ge 0$ and let $(X_t)_{t\in\mathbb{R}}$ denote the solution to (3.3). Then for any $s<t$, it holds that

$$E[X_t \mid X_u,\ u\le s] = g(t-s)X_s + \int_{[0,t-s)}\int_{-\infty}^{s} X_w\int_{[0,\infty)}\big(D^{\beta}_{-}1_{(s,t-u]}\big)(v+w)\,\eta(dv)\,dw\,g(du),$$

where $g(du) = \delta_0(du) + (D^{\beta}g)*\eta(u)\,du$ is the Lebesgue–Stieltjes measure induced by $g$.

4 Delays of exponential type

Let $A$ be an $n\times n$ matrix where all its eigenvalues belong to $\{z\in\mathbb{C} : \mathrm{Re}(z)<0\}$, and let $b\in\mathbb{R}^n$ and $\kappa\in\mathbb{R}$. In this section we restrict our attention to measures $\eta$ of the form

$$\eta(dt) = -\kappa\delta_0(dt) + f(t)\,dt \quad\text{with}\quad f(t) = 1_{[0,\infty)}(t)\,b^{\top}e^{At}e_1, \tag{4.1}$$

where $e_1 := [1,0,\dots,0]^{\top}\in\mathbb{R}^n$. Note that $e_1$ is used as a normalization; the effect of replacing $e_1$ by any $c\in\mathbb{R}^n$ can be incorporated in the choice of $A$ and $b$. It is well-known that the assumption on the eigenvalues of $A$ implies that all the entries of $e^{Au}$ decay exponentially fast as $u\to\infty$, so that $\eta$ is a finite measure on $[0,\infty)$ with moments of any order. Since the Fourier transform $F[f]$ of $f$ is given by

$$F[f](y) = b^{\top}(I_n iy - A)^{-1}e_1, \quad y\in\mathbb{R},$$

it admits a fraction decomposition; that is, there exist real polynomials $Q,R:\mathbb{C}\to\mathbb{C}$, $Q$ being monic with the eigenvalues of $A$ as its roots and being of larger degree than $R$, such that

$$F[f](y) = -\frac{R(iy)}{Q(iy)} \tag{4.2}$$

for $y\in\mathbb{R}$. (This is a direct consequence of the inversion formula $B^{-1} = \mathrm{adj}(B)/\det(B)$.) By assuming that $Q$ and $R$ have no common roots, the pair $(Q,R)$ is unique. The following existence and uniqueness result is simply an application of Theorem 3.2 to the particular setup in question:

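Corollary 4.1. Let $Q$ and $R$ be given as in (4.2). Suppose that $\kappa + b^{\top}A^{-1}e_1\ne 0$ and

$$Q(z)\big[z + \kappa z^{\beta}\big] + R(z)z^{\beta}\ne 0 \tag{4.3}$$

for all $z\in\mathbb{C}\setminus\{0\}$ with $\mathrm{Re}(z)\ge 0$. Then there exists a unique purely non-deterministic solution $(X_t)_{t\in\mathbb{R}}$ to (3.3) with $\eta$ given by (4.1) and it is given by (3.11) with $g:\mathbb{R}\to\mathbb{R}$ characterized through the relation

$$F[g](y) = \frac{Q(iy)}{Q(iy)\big[iy + \kappa(iy)^{\beta}\big] + R(iy)(iy)^{\beta}}, \quad y\in\mathbb{R}. \tag{4.4}$$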
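For a concrete feel of the fraction decomposition (4.2) and the transfer function (4.4), the sketch below treats the special case $n = 2$, where the adjugate of $zI_2 - A$ can be written down by hand: $Q(z) = \det(zI_2 - A)$ and $R(z) = -b^{\top}\mathrm{adj}(zI_2 - A)e_1$. The matrix, the vector $b$ and all function names are hypothetical choices for illustration, and the sketch does not verify condition (4.3).

```python
def fraction_decomposition_2x2(A, b):
    """Coefficients (highest degree first) of Q and R in (4.2) for a 2x2 matrix A."""
    (a11, a12), (a21, a22) = A
    b1, b2 = b
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    Q = [1.0, -trace, det]  # det(zI - A) = z^2 - trace*z + det
    # b^T adj(zI - A) e_1 = b1*(z - a22) + b2*a21, and R is its negative
    R = [-b1, b1 * a22 - b2 * a21]
    return Q, R


def polyval(coeffs, z):
    """Horner evaluation with the highest-degree coefficient first."""
    out = 0.0
    for c in coeffs:
        out = out * z + c
    return out


def fourier_g(y, Q, R, beta, kappa):
    """Right-hand side of (4.4); Python's complex power uses the principal branch."""
    z = 1j * y
    zb = z ** beta
    return polyval(Q, z) / (polyval(Q, z) * (z + kappa * zb) + polyval(R, z) * zb)


# assumed example: eigenvalues of A are -1 and -2
A = ((0.0, 1.0), (-2.0, -3.0))
b = (1.0, 0.0)
Q, R = fraction_decomposition_2x2(A, b)
print(Q, R)
print(fourier_g(0.7, Q, R, beta=0.25, kappa=1.0))
```

For this example the output polynomials are $Q(z) = z^2+3z+2$ and $R(z) = -(z+3)$, which have no common roots, so the pair is the unique one described before the corollary.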

Before giving examples we state Proposition 4.2, which shows that the general SFDDE (3.3) can be written as

$$dX_t = -\kappa D^{\beta}X_t\,dt + \int_0^{\infty} X_{t-u}D^{\beta}f(u)\,du\,dt + dL_t, \quad t\in\mathbb{R}, \tag{4.5}$$

when $\eta$ is of the form (4.1). In case $\kappa = 0$, (4.5) is a (non-fractional) SDDE. However, the usual existence results obtained in this setting (for instance, those in [2] and [17]) are not applicable, since the delay measure $D^{\beta}f(u)\,du$ has unbounded support and zero total mass $\int_0^{\infty} D^{\beta}f(u)\,du = 0$.

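Proposition 4.2. Let $f$ be of the form (4.1). Then $D^{\beta} f : \mathbb{R} \to \mathbb{R}$ defined by $D^{\beta} f(t) = 0$ for $t \leq 0$ and

$$D^{\beta} f(t) = \frac{1}{\Gamma(1-\beta)}\, b^\top \Bigl( A e^{At} \int_0^t e^{-Au} u^{-\beta}\,du + t^{-\beta} I_n \Bigr) e_1$$

for $t > 0$ belongs to $L^1 \cap L^2$. If in addition (4.3) holds, $\kappa + b^\top A^{-1} e_1 \neq 0$ and $(X_t)_{t\in\mathbb{R}}$ is the solution given in Corollary 4.1, then

$$\int_0^\infty D^{\beta} X_{t-u}\, f(u)\,du = \int_0^\infty X_{t-u}\, D^{\beta} f(u)\,du$$

almost surely for any $t \in \mathbb{R}$.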


Remark 4.3. Due to the structure of the function $g$ in (4.4) one may, in line with the interpretation of CARMA processes, think of the corresponding solution $(X_t)_{t\in\mathbb{R}}$ as a stationary process that satisfies the formal equation

$$\bigl(Q(D)[D + \kappa D^{\beta}] + R(D) D^{\beta}\bigr) X_t = Q(D) D L_t, \quad t \in \mathbb{R}, \tag{4.6}$$

where $D$ denotes differentiation with respect to $t$ and $D^{\beta}$ is a suitable fractional derivative. Indeed, by heuristically applying the Fourier transform $\mathcal{F}$ to (4.6) and using computation rules such as $\mathcal{F}[DX](y) = iy\,\mathcal{F}[X](y)$ and $\mathcal{F}[D^{\beta}X](y) = (iy)^{\beta}\mathcal{F}[X](y)$, one ends up concluding that $(X_t)_{t\in\mathbb{R}}$ is of the form (3.11) with $g$ characterized by (4.4). For two monic polynomials $P$ and $Q$ with $q := \deg(Q) = \deg(P) - 1$ and all their roots contained in $\{z \in \mathbb{C} : \operatorname{Re}(z) < 0\}$, consider the FICARMA($q+1$, $\beta$, $q$) process $(X_t)_{t\in\mathbb{R}}$. Heuristically, by applying $\mathcal{F}$ as above, $(X_t)_{t\in\mathbb{R}}$ may be thought of as the solution to $P(D) D^{\beta} X_t = Q(D) D L_t$, $t \in \mathbb{R}$. By choosing the polynomial $R$ and the constant $\kappa$ such that $P(z) = Q(z)[z + \kappa] + R(z)$ we can think of $(X_t)_{t\in\mathbb{R}}$ as the solution to the formal equation

$$\bigl(Q(D)[D^{1+\beta} + \kappa D^{\beta}] + R(D) D^{\beta}\bigr) X_t = Q(D) D L_t, \quad t \in \mathbb{R}. \tag{4.7}$$

It follows that (4.6) and (4.7) are closely related, the only difference being that $D + \kappa D^{\beta}$ is replaced by $D^{1+\beta} + \kappa D^{\beta}$. In particular, one may view solutions to SFDDEs corresponding to measures of the form (4.1) as being of the same type as FICARMA processes. While the considerations above apply only to the case where $\deg(P) = q + 1$, it should be possible to extend the SFDDE framework so that solutions are comparable to the FICARMA processes in the general case $\deg(P) > q$ by following the lines of [3], where similar theory is developed for the SDDE setting.

We will now give two examples of (4.5).

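Example 4.4. Consider choosing $\eta = -\kappa\delta_0$ for some $\kappa > 0$ so that (3.3) becomes

$$X_t - X_s = -\frac{\kappa}{\Gamma(1-\beta)} \int_{-\infty}^{t} \bigl[(t-u)^{-\beta} - (s-u)_+^{-\beta}\bigr] X_u\,du + L_t - L_s, \quad s < t, \tag{4.8}$$

or, in short,

$$dX_t = -\kappa D^{\beta} X_t\,dt + dL_t, \quad t \in \mathbb{R}. \tag{4.9}$$

To argue that a unique purely non-deterministic solution exists, we observe that $Q(z) = 1$ and $R(z) = 0$ for all $z \in \mathbb{C}$. Thus, in light of Corollary 4.1 and (4.3), it suffices to argue that $z + \kappa z^{\beta} \neq 0$ for all $z \in \mathbb{C}\setminus\{0\}$ with $\operatorname{Re}(z) \geq 0$. By writing such $z$ as $z = re^{i\theta}$ for a suitable $r > 0$ and $\theta \in [-\pi/2, \pi/2]$, the condition may be written as

$$\bigl(r\cos(\theta) + \kappa r^{\beta}\cos(\beta\theta)\bigr) + i\bigl(r\sin(\theta) + \kappa r^{\beta}\sin(\beta\theta)\bigr) \neq 0. \tag{4.10}$$

If the imaginary part of the left-hand side of (4.10) is zero it must be the case that $\theta = 0$, since $\kappa > 0$ while $\sin(\theta)$ and $\sin(\beta\theta)$ are of the same sign. However, if $\theta = 0$, the real part of the left-hand side of (4.10) is $r + \kappa r^{\beta} > 0$. Consequently, Corollary 4.1 implies that a solution to (4.9) is characterized by (3.11) and $\mathcal{F}[g](y) = (iy + \kappa(iy)^{\beta})^{-1}$ for $y \in \mathbb{R}$. In particular, $\gamma_X$ takes the form

$$\gamma_X(t) = \int_{\mathbb{R}} \frac{e^{ity}}{y^2 + 2\kappa\sin(\beta\pi/2)|y|^{1+\beta} + \kappa^2|y|^{2\beta}}\,dy, \quad t \in \mathbb{R}. \tag{4.11}$$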
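Both ingredients of this example lend themselves to a quick numerical sanity check: the non-vanishing of $z + \kappa z^{\beta}$ on the closed right half-plane, and the fact that the denominator in (4.11) equals $|iy + \kappa(iy)^{\beta}|^2$. The sketch below is illustrative only; the function names are hypothetical, powers use the principal branch, and a finite grid can of course only support, not prove, the analytic argument.

```python
import numpy as np


def symbol(z, kappa, beta):
    """z + kappa * z**beta with the principal branch of the power."""
    z = np.asarray(z, dtype=complex)
    return z + kappa * z**beta


def spectral_denominator(y, kappa, beta):
    """Closed-form denominator appearing in (4.11)."""
    a = np.abs(np.asarray(y, dtype=float))
    return (a**2 + 2 * kappa * np.sin(beta * np.pi / 2) * a**(1 + beta)
            + kappa**2 * a**(2 * beta))


def min_modulus_right_half_plane(kappa, beta, n_r=200, n_theta=181):
    """Smallest |z + kappa z^beta| over a polar grid with Re(z) >= 0, z != 0."""
    r = np.logspace(-6, 6, n_r)[:, None]
    theta = np.linspace(-np.pi / 2, np.pi / 2, n_theta)[None, :]
    return np.abs(symbol(r * np.exp(1j * theta), kappa, beta)).min()
```

Comparing `np.abs(symbol(1j * y, kappa, beta))**2` with `spectral_denominator(y, kappa, beta)` on a grid of nonzero `y` reproduces the identity behind (4.11).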


In Figure 1 we have plotted the ACF of $(X_t)_{t\in\mathbb{R}}$ using (4.11) with $\kappa = 1$ and $\beta \in \{0.1, 0.2, 0.3, 0.4\}$. We compare it to the ACF of the corresponding fractional Ornstein–Uhlenbeck process (equivalently, the FICARMA(1, $\beta$, 0) process) which was presented in (1.4). To do so, we use that its autocovariance function $\gamma_\beta$ is given by

$$\gamma_\beta(t) = \int_{\mathbb{R}} \frac{e^{ity}}{|y|^{2(1+\beta)} + \kappa^2|y|^{2\beta}}\,dy, \quad t \in \mathbb{R}. \tag{4.12}$$

From these plots it becomes evident that, although the ACFs share the same behavior at infinity, they behave differently near zero. In particular, we see that the ACF of $(X_t)_{t\in\mathbb{R}}$ decays more rapidly around zero, which is in line with Proposition 3.8 and the fact that the $L^2(\mathbb{P})$ Hölder continuity of the fractional Ornstein–Uhlenbeck process increases as $\beta$ increases (cf. the introduction).

Figure 1: The left plot is the ACF based on (4.11) with $\beta = 0.1$ (yellow), $\beta = 0.2$ (green), $\beta = 0.3$ (black) and $\beta = 0.4$ (blue). With $\beta = 0.4$ fixed, the plot on the right compares the ACF based on (4.11) with $\kappa = 1$ (blue) to the ACF based on (4.12) for values of $\kappa$ between 0.125 and 2 (red), where the ACF decreases in $\kappa$; in particular, the top curve corresponds to $\kappa = 0.125$ and the bottom to $\kappa = 2$.

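Example 4.5. Suppose that $\eta$ is given by (4.1) with $\kappa = 0$, $A = -\kappa_1$ and $b = -\kappa_2$ for some $\kappa_1, \kappa_2 > 0$. In this case, $f(t) = -\kappa_2 e^{-\kappa_1 t}$ and (4.5) becomes

$$dX_t = \frac{\kappa_2}{\Gamma(1-\beta)} \int_0^\infty X_{t-u}\Bigl(\kappa_1 e^{-\kappa_1 u}\int_0^u e^{\kappa_1 v} v^{-\beta}\,dv - u^{-\beta}\Bigr)\,du\,dt + dL_t, \quad t \in \mathbb{R}, \tag{4.13}$$

and since $Q(z) = z + \kappa_1$ and $R(z) = \kappa_2$ we have that

$$zQ(z) + R(z)z^{\beta} = z^2 + \kappa_1 z + \kappa_2 z^{\beta}.$$

To verify (4.3), set $z = x + iy$ for $x > 0$ and $y \in \mathbb{R}$ and note that

$$z^2 + \kappa_1 z + \kappa_2 z^{\beta} = \bigl(x^2 - y^2 + \kappa_1 x + \kappa_2\cos(\beta\theta_z)|z|^{\beta}\bigr) + i\bigl(\kappa_1 y + 2xy + \kappa_2\sin(\beta\theta_z)|z|^{\beta}\bigr) \tag{4.14}$$

for a suitable $\theta_z \in (-\pi/2, \pi/2)$. For the imaginary part of (4.14) to be zero it must be the case that

$$(\kappa_1 + 2x)y = -\kappa_2\sin(\beta\theta_z)|z|^{\beta},$$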


and this can only happen if $y = 0$, since $x, \kappa_1, \kappa_2 > 0$ and the sign of $y$ is the same as that of $\sin(\beta\theta_z)$. However, if $y = 0$ it is easy to see that the real part of (4.14) cannot be zero for any $x > 0$, so we conclude that (4.3) holds and that there exists a stationary solution $(X_t)_{t\in\mathbb{R}}$ given through the kernel (4.4). With $\gamma_1 = \cos(\beta\pi/2)$ and $\gamma_2 = \sin(\beta\pi/2)$ the autocovariance function $\gamma_X$ is given by

$$\gamma_X(t) = \int_{\mathbb{R}} \frac{e^{ity}\,(y^2 + \kappa_1^2)}{y^4 + \kappa_1^2 y^2 + 2\kappa_2\bigl(\kappa_1\gamma_2|y|^{1+\beta} - \gamma_1|y|^{2+\beta}\bigr) + \kappa_2^2|y|^{2\beta}}\,dy, \quad t \in \mathbb{R}. \tag{4.15}$$

The polynomials of the associated FICARMA(2, $\beta$, 1) process are given by $P(z) = z^2 + \kappa_1 z + \kappa_2$ and $Q(z) = z + \kappa_1$ (see Remark 4.3) and the autocovariance function $\gamma_\beta$ takes the form

$$\gamma_\beta(t) = \int_{\mathbb{R}} \frac{e^{ity}\,(y^2 + \kappa_1^2)}{|y|^{4+2\beta} + (\kappa_1^2 - 2\kappa_2)|y|^{2+2\beta} + \kappa_2^2|y|^{2\beta}}\,dy, \quad t \in \mathbb{R}. \tag{4.16}$$

In Figure 2 we have plotted the ACF based on (4.15) for $\kappa_1 = 1$ and various values of $\kappa_2$ and $\beta$. For comparison we have also plotted the ACF based on (4.16) for the same choices of $\kappa_2$ and $\beta$. From these plots we see that both the ACF corresponding to (4.15) and (4.16) are decreasing in $\kappa_2$, which is similar to the role of $\kappa$ in Example 4.4. It appears as well that a larger $\kappa_2$ causes more curvature, although this effect is less pronounced for (4.15) than for (4.16).
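The denominator in (4.15) can be verified by a short computation. The worked derivation below is a sketch under the assumption that the principal branch is used, so that $(iy)^{\beta} = |y|^{\beta}(\gamma_1 + i s\gamma_2)$ with $\gamma_1 = \cos(\beta\pi/2)$ and $\gamma_2 = \sin(\beta\pi/2)$; the shorthand $s = \operatorname{sgn}(y)$ is introduced here purely for illustration.

```latex
\begin{align*}
Q(iy)\,iy + R(iy)(iy)^{\beta}
  &= (iy)^2 + \kappa_1 (iy) + \kappa_2 (iy)^{\beta} \\
  &= \bigl(-y^2 + \kappa_2\gamma_1|y|^{\beta}\bigr)
     + i\bigl(\kappa_1 y + s\,\kappa_2\gamma_2|y|^{\beta}\bigr), \\
\bigl|Q(iy)\,iy + R(iy)(iy)^{\beta}\bigr|^2
  &= y^4 - 2\kappa_2\gamma_1|y|^{2+\beta} + \kappa_2^2\gamma_1^2|y|^{2\beta}
     + \kappa_1^2 y^2 + 2\kappa_1\kappa_2\gamma_2|y|^{1+\beta}
     + \kappa_2^2\gamma_2^2|y|^{2\beta} \\
  &= y^4 + \kappa_1^2 y^2
     + 2\kappa_2\bigl(\kappa_1\gamma_2|y|^{1+\beta} - \gamma_1|y|^{2+\beta}\bigr)
     + \kappa_2^2|y|^{2\beta}.
\end{align*}
```

The last step uses $s\,y = |y|$ and $\gamma_1^2 + \gamma_2^2 = 1$; the numerator $|Q(iy)|^2 = y^2 + \kappa_1^2$ is common to both spectral densities.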

Figure 2: First row is ACF based on (4.15), second row is ACF based on (4.16), and the columns correspond to increasing values of $\kappa_2$ (the second and third columns being $\kappa_2 = 1$ and $\kappa_2 = 2$). Within each plot, the lines correspond to $\beta = 0.1$ (yellow), $\beta = 0.2$ (green), $\beta = 0.3$ (black) and $\beta = 0.4$ (blue). In all plots, $\kappa_1 = 1$.

5 Simulation from the SFDDE

In the following we will focus on simulating from (3.3). We begin this simulation study by considering the Ornstein–Uhlenbeck type equation discussed in Example 4.4 with $\kappa = 1$ and under the assumption that $(L_t)_{t\in\mathbb{R}}$ is a standard Brownian motion. Let $c_1 = 100/\Delta$ and $c_2 = 2000/\Delta$. We generate a simulation of the solution process $(X_t)_{t\in\mathbb{R}}$ on a grid of size $\Delta = 0.01$ and with $3700/\Delta$ steps of size $\Delta$ starting from $-c_1 - c_2$ and ending at $1600/\Delta$. Initially, we set $X$ equal to zero for the first $c_1$ points in the grid and then discretize (4.8) using the approximation

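$$\int_{\mathbb{R}} \bigl[(n\Delta-u)_+^{-\beta} - ((n-1)\Delta-u)_+^{-\beta}\bigr] X_u\,du$$

$$\simeq \frac{1}{1-\beta}\Delta^{1-\beta} X_{(n-1)\Delta} + \sum_{k=n-c_1}^{n-1} \frac{X_{k\Delta}+X_{(k-1)\Delta}}{2} \int_{(k-1)\Delta}^{k\Delta} \bigl[(n\Delta-u)^{-\beta} - ((n-1)\Delta-u)^{-\beta}\bigr]\,du$$

$$= \frac{1}{1-\beta}\Delta^{1-\beta} X_{(n-1)\Delta} + \frac{1}{1-\beta}\sum_{k=n-c_1}^{n-1} \frac{X_{k\Delta}+X_{(k-1)\Delta}}{2} \Bigl(((n-k+1)\Delta)^{1-\beta} - 2((n-k)\Delta)^{1-\beta} + ((n-k-1)\Delta)^{1-\beta}\Bigr)$$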

for $n = -c_2 + 1, \dots, 3700/\Delta - c_1 - c_2$. Next, we disregard the first $c_2$ values of the simulated sample path to obtain an approximate sample from the stationary distribution. We assume that the process is observed on a unit grid resulting in simulated values $X_1, \dots, X_{1600}$. This is repeated 200 times, and in every repetition the sample ACF based on $X_1, \dots, X_N$ is computed for lags $k = 1, \dots, 25$ and $N = 100, 400, 1600$.
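The discretization of (4.8) described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the code used for the study: the function name `simulate_fou_type` and its parameters are hypothetical, `memory` and `burn_in` play the roles of the truncation and burn-in constants, the default sizes are far smaller than those in the text, and the driving noise is a standard Brownian motion. The weights are the exact integrals of the kernel difference over each grid cell, combined with trapezoidal averages of the path.

```python
import numpy as np
from math import gamma


def simulate_fou_type(beta, kappa=1.0, delta=0.1, n_keep=200,
                      memory=100, burn_in=300, rng=None):
    """Explicit scheme for (4.8) on a grid of mesh `delta` (illustrative sizes)."""
    rng = np.random.default_rng() if rng is None else rng
    j = np.arange(1, memory + 1, dtype=float)
    scale = delta ** (1.0 - beta) / (1.0 - beta)
    # Weight of the newest cell [(n-1)delta, n delta], where only the first
    # kernel term is active.
    w0 = scale
    # Weights of the lag-j cells [(n-1-j)delta, (n-j)delta], j = 1..memory.
    w = scale * ((j + 1) ** (1 - beta) - 2 * j ** (1 - beta)
                 + (j - 1) ** (1 - beta))
    total = memory + burn_in + n_keep
    x = np.zeros(total + 1)                 # x[0..memory] stay zero
    noise = rng.normal(0.0, np.sqrt(delta), size=total + 1)
    for n in range(memory + 1, total + 1):
        past = x[n - 1 - memory:n]          # X_{n-1-memory}, ..., X_{n-1}
        mid = 0.5 * (past[1:] + past[:-1])  # cell averages, oldest first
        integral = w0 * x[n - 1] + np.dot(mid[::-1], w)
        x[n] = x[n - 1] - kappa / gamma(1.0 - beta) * integral + noise[n]
    return x[memory + burn_in + 1:]         # drop zeros and burn-in
```

A call such as `simulate_fou_type(0.2, rng=np.random.default_rng(0))` returns `n_keep` grid values; subsampling every `round(1 / delta)`-th value would mimic observation on a unit grid.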

In long-memory models, the sample mean $\bar{X}_N$ can be a poor approximation to the true mean $\mathbb{E}[X_0]$ even for large $N$, and this may result in considerable negative (finite sample) bias in the sample ACF (see, e.g., [23]). Due to this bias, it may be difficult to see if we succeed in simulating from (3.3), and hence we will assume that $\mathbb{E}[X_0]$ is known to be zero when computing the sample ACF. We calculate the 95% confidence interval

$$\Bigl[\bar{\rho}(k) - 1.96\,\frac{\hat{\sigma}(k)}{\sqrt{200}},\ \bar{\rho}(k) + 1.96\,\frac{\hat{\sigma}(k)}{\sqrt{200}}\Bigr]$$

for the mean of the sample ACF based on $N$ observations at lag $k$. Here $\bar{\rho}(k)$ is the sample mean and $\hat{\sigma}(k)$ is the sample standard deviation of the ACF at lag $k$ based on the 200 replications. In Figure 3, the theoretical ACFs and the corresponding 95% confidence intervals for the mean of the sample ACFs are plotted for $\beta = 0.1, 0.2$ and $N = 100, 400, 1600$. We see that, when correcting for the bias induced by an unknown mean $\mathbb{E}[X_0]$, simulation from equation (4.8) results in a fairly unbiased estimator of the ACF for small values of $\beta$. When $\beta > 0.25$, in the case where the ACF of $(X_t)_{t\in\mathbb{R}}$ is not even in $L^2$, the results are more unstable as it requires large values of $c_1$ and $c_2$ to ensure that the simulation results in a good approximation to the stationary distribution of $(X_t)_{t\in\mathbb{R}}$. Moreover, even after correcting for the bias induced by an unknown mean of the observed process, the sample ACF for the ARFIMA process shows considerable finite sample bias when $\beta > 0.25$, see [23], and hence we may expect this to apply to solutions to (3.3) as well.

In Figure 4 we have plotted box plots for the 200 replications of the sample ACF for $\beta = 0.1, 0.2$ and $N = 100, 400, 1600$. We see that the sample ACFs have the expected convergence when $N$ grows and that the distribution is more concentrated in the case where less memory is present.
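The sample ACF with a known zero mean and the confidence band for its mean across replications can be computed as in the following sketch. The function names are hypothetical, and normalizing every lag by the sample size $N$ is one common convention assumed here; the text does not state which normalization was used.

```python
import numpy as np


def sample_acf_zero_mean(x, max_lag):
    """Sample ACF at lags 1..max_lag, treating the mean as known to be zero."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    c0 = np.dot(x, x) / n
    return np.array([np.dot(x[:n - k], x[k:]) / n / c0
                     for k in range(1, max_lag + 1)])


def mean_acf_confidence_band(paths, max_lag, z=1.96):
    """Band rho_bar(k) -/+ z * sigma_hat(k) / sqrt(m) over m replications."""
    acfs = np.array([sample_acf_zero_mean(p, max_lag) for p in paths])
    m = len(paths)
    rho_bar = acfs.mean(axis=0)
    sigma_hat = acfs.std(axis=0, ddof=1)   # sample standard deviation
    half_width = z * sigma_hat / np.sqrt(m)
    return rho_bar - half_width, rho_bar + half_width
```

With 200 simulated paths in `paths`, `mean_acf_confidence_band(paths, 25)` gives the lower and upper limits at lags 1 to 25, matching the form of the interval displayed above.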


Figure 3: Theoretical ACF and 95% confidence intervals of the mean of the sample ACF based on 200 replications of $X_1, \dots, X_N$. Columns correspond to $N = 100$, $N = 400$ and $N = 1600$, respectively, and rows correspond to $\beta = 0.1$ and $\beta = 0.2$, respectively. The model is (4.8).

Figure 4: Box plots for the sample ACF based on 200 replications of $X_1, \dots, X_N$ together with the theoretical ACF. Columns correspond to $N = 100$, $N = 400$ and $N = 1600$, respectively, and rows correspond to $\beta = 0.1$ and $\beta = 0.2$, respectively. The model is (4.8).


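Following the same approach as above, we simulate the solution to the equation discussed in Example 4.5. Specifically, the simulation is based on equation (3.3) restricted to the case where $\eta(dt) = -e^{-t}\,dt$ and $(L_t)_{t\in\mathbb{R}}$ is a standard Brownian motion. In this case, we use the approximation

$$-\int_{\mathbb{R}} \bigl[(n\Delta-u)_+^{-\beta} - ((n-1)\Delta-u)_+^{-\beta}\bigr] \int_0^\infty X_{u-v}\, e^{-v}\,dv\,du = \int_0^\infty X_{n\Delta-v} \int_0^v \bigl[(u-\Delta)_+^{-\beta} - u^{-\beta}\bigr] e^{u-v}\,du\,dv$$

$$\simeq \Delta X_{(n-1)\Delta}\,\frac{\varphi(\Delta)}{2} + \sum_{k=2}^{c_1} \Delta\,\frac{X_{(n-k)\Delta} + X_{(n-k+1)\Delta}}{2}\cdot\frac{\varphi(k\Delta) + \varphi((k-1)\Delta)}{2},$$

where $\varphi : \mathbb{R} \to \mathbb{R}$ is given by

$$\varphi(v) = \int_0^v \bigl[(u-\Delta)_+^{-\beta} - u^{-\beta}\bigr] e^{u-v}\,du.$$

We approximate $\varphi$ recursively by noting that

$$\varphi(k\Delta) = \int_0^{k\Delta} \bigl[(u-\Delta)_+^{-\beta} - u^{-\beta}\bigr] e^{u-k\Delta}\,du \simeq \frac{1 + e^{-\Delta}}{2} \int_{(k-1)\Delta}^{k\Delta} \bigl[(u-\Delta)_+^{-\beta} - u^{-\beta}\bigr]\,du + e^{-\Delta}\varphi((k-1)\Delta)$$

$$= \frac{1}{1-\beta}\cdot\frac{1 + e^{-\Delta}}{2}\Bigl(2((k-1)\Delta)^{1-\beta} - (k\Delta)^{1-\beta} - ((k-2)_+\Delta)^{1-\beta}\Bigr) + e^{-\Delta}\varphi((k-1)\Delta)$$

for $k \geq 1$. The theoretical ACFs and corresponding 95% confidence intervals are plotted in Figure 5 and the box plots in Figure 6. The findings are consistent with the first example that we considered in the sense of convergence of the sample ACF and the effect of memory (the value of $\beta$).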

6 Proofs

Proof of Proposition 3.1.

For

γ >

0 deﬁne

(

) =

(

) for each

z ∈ C \{

}

with

(

)

≥

0. By continuity of

and the asymptotics

(

)

| ∼ |η

([0

,∞

))

−1

|z|

|z| →

and |h

(z)| ∼ |z|

γ−1

, |z| → ∞, it follows that

sup

x>0

(x + iy)|

dy < ∞ (6.1)

for

γ ∈

(

−

2). In other words,

is a certain Hardy function, and thus there exists

a function

: R → R

which is vanishing on (

−∞,

0) and has

[

](

) =

(

)

when

(

)

0, see [2, 11, 13]. Note that

is indeed real-valued, since

(x −iy)

(

) for

y ∈ R

and a ﬁxed

x >

0. We can apply [24, Proposition 2.3] to deduce

that there exists a function

g ∈ L

satisfying

(3.6)

and that it can be represented as

the (left-sided) Riemann–Liouville fractional integral of f

, that is,

g(t) =

Γ (β)

(u)(t −u)

β−1

du, t > 0.


Figure 5: Theoretical ACF and 95% confidence intervals of the mean of the sample ACF based on 200 replications of $X_1, \dots, X_N$. Columns correspond to $N = 100$, $N = 400$ and $N = 1600$, respectively, and rows correspond to $\beta = 0.1$ and $\beta = 0.2$, respectively. The model is (4.13).

Figure 6: Box plots for the sample ACF based on 200 replications of $X_1, \dots, X_N$ together with the theoretical ACF. Columns correspond to $N = 100$, $N = 400$ and $N = 1600$, respectively, and rows correspond to $\beta = 0.1$ and $\beta = 0.2$, respectively. The model is (4.13).


Conversely, [24, Theorem 2.1] ensures that

given by

(3.7)

is a well-deﬁned limit

and that

. In particular, we have shown (ii) and if we can argue that

∈ L

we have shown (i) as well. This follows from the assumption in

(3.1)

, since then we

have that

y 7→ L

[

](

) is diﬀerentiable for any

x ≥

0 (except at 0 when

= 0) and

L[u 7→ uf

(u)](x + iy) = i

L[f

](x + iy)

L[u η(du)](x + iy) + (1 −β)(x + iy)

−β

h(x + iy)

(6.2)

The function

[

u 7→ uf

(

)] is analytic on

{z ∈ C

(

)

}

and from the identity

(6.2)

it is not too diﬃcult to see that it also satisﬁes the Hardy condition

(6.1)

. This

means

u 7→ uf

(

) belongs to

, and hence we have that

belongs to

. Since

the Riemann–Liouville integral of

of order

and

∈ L

∩L

, [3, Proposition 4.3]

implies that g ∈ L

for (1 −β)

−1

< γ ≤ 2.

It is straightforward to verify (3.9) and to obtain the identity





∗η(u − · ) du =



−

(s,t]



(u)g ∗η(u − · ) du

almost everywhere by comparing their Fourier transforms. This establishes the rela-

tion

g(t −v) −g(s −v) =





∗η(u −v) du + 1

(s,t]

(v).

By letting

s → −∞

, and using that

and

are both vanishing on (

−∞,

0), we deduce

that

g(t) = 1

[0,∞)

(t)



1 +





∗η(u) du



for almost all t ∈ R which shows (3.8) and, thus, ﬁnishes the proof. 

Proof of Theorem 3.2. Since $g \in L^2$, according to Proposition 3.1, and $\mathbb{E}[L_1^2] < \infty$ and $\mathbb{E}[L_1] = 0$,

$$X_t = \int_{-\infty}^{t} g(t-u)\,dL_u, \quad t \in \mathbb{R},$$

is a well-defined process (e.g., in the sense of [26]) which is stationary with mean zero and finite second moments. By integrating both sides of (3.9) with respect to $(L_t)_{t\in\mathbb{R}}$ we obtain

$$X_t - X_s = \int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} \bigl(D^{\beta}_{-}1_{(s,t]}\bigr)(u)\, g*\eta(u-r)\,du\Bigr)\,dL_r + L_t - L_s.$$

By a stochastic Fubini result (e.g., [1, Theorem 3.1]) we can change the order of integration (twice) and obtain

$$\int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} \bigl(D^{\beta}_{-}1_{(s,t]}\bigr)(u)\, g*\eta(u-r)\,du\Bigr)\,dL_r = \int_{\mathbb{R}} \bigl(D^{\beta}_{-}1_{(s,t]}\bigr)(u)\, X*\eta(u)\,du.$$

This shows that $(X_t)_{t\in\mathbb{R}}$ is a solution to (3.3). To show uniqueness, note that the spectral process $\Lambda_X$ (with spectral distribution, say, $F_X$) of any purely non-deterministic solution $(X_t)_{t\in\mathbb{R}}$ satisfies

$$\int_{\mathbb{R}} \mathcal{F}[1_{(s,t]}](-y)\,(iy)^{\beta} h(iy)\,\Lambda_X(dy) = L_t - L_s \tag{6.3}$$

almost surely for all choices of $s < t$. This follows from the results in the supplementary material on spectral representations (see Section 7). Using the fact that $(X_t)_{t\in\mathbb{R}}$ is purely non-deterministic, $F_X$ is absolutely continuous with respect to the Lebesgue measure, and hence we can extend (6.3) from $1_{(s,t]}$ to any function $f \in L^2$ using an approximation of $f$ with simple functions of the form $\sum_{j=1}^{n} \alpha_j 1_{(t_{j-1},t_j]}$ for $\alpha_j \in \mathbb{C}$ and $t_0 < t_1 < \cdots < t_n$. Specifically, we establish that

$$\int_{\mathbb{R}} \mathcal{F}[f](-y)\,(iy)^{\beta} h(iy)\,\Lambda_X(dy) = \int_{\mathbb{R}} f(u)\,dL_u \tag{6.4}$$

almost surely for any $f \in L^2$. In particular we may take $f = g(t - \cdot)$, $g$ being the solution kernel characterized in (3.6), so that $\mathcal{F}[f](y) = e^{-ity}(-iy)^{-\beta}/h(-iy)$, and (6.4) thus implies that $X_t = \int_{-\infty}^{t} g(t-u)\,dL_u$, which ends the proof. □

Proof of Proposition 3.6.

We start by arguing that the limit in

(3.13)

exists and is

equal to

−∞

(

t −u

)

. For a given

δ >

0 it follows by a stochastic Fubini result

that

Γ (1 −β)

∞

−X

t−u

1+β

du =

g(t −r) dL

, (6.5)

where

g(t) =

Γ (1 −β)

∞

g(t) −g(t −u)

1+β

du, t > 0,

and

(

) = 0 for

t ≤

0. Suppose for the moment that (

)

t∈R

is a Brownian motion,

so that (

)

t∈R

-Hölder continuous for all

γ ∈

2) by

(3.3)

. Then, almost surely,

u 7→ (X

−X

t−u

)/u

1+β

is in L

and the relation (6.5) thus shows that

g(t −r) −D

g(t −r)

−−→ 0 as δ,δ

→ 0,

which in turn implies that (

)

δ>0

has a limit in

. We also know that this limit

must be

, since

g → D

pointwise as

δ ↓

0 by

(3.7)

. Having established this

convergence, which does not rely on (

)

t∈R

being a Brownian motion, it follows

immediately from

(6.5)

and the isometry property of the integral map

· dL

that

the limit in

(3.13)

exists and that

−∞

(

t−u

)

. To show

(3.14)

we start by

recalling the deﬁnition of

−

(s,t]

(3.2)

and that

[

−

(s,t]

](

) = (

−iy

)

[

(s,t]

](

This identity can be shown by using that the improper integral

∞

±iv

γ−1

is equal

to Γ (γ)e

±iπγ/2

for any γ ∈ (0, 1). Now observe that





−

(s,t]



(u)g ∗η(u − · ) du



(y) = (−iy)

F [1

(s,t]

](y)F [g](−y)F [η](−y)

= F [1

(s,t]

](y)F

h



∗η

(−y)

= F







∗η(u − · ) du



(y),

and hence

(

−

(s,t]

)(

)

g ∗η

(

u − ·

)

(

)

∗η

(

u − ·

)

almost everywhere.

Consequently, using that

−∞

(

t −u

)

and applying a stochastic Fubini


result twice,





∗η(u) du =





∗η(u −r) du dL



−

(s,t]



(u)g ∗η(u −r) du dL

Γ (1 −β)

(t −u)

−β

−(s −u)

−β

X ∗η(u) du.

The semimartingale property of (

)

t∈R

is now an immediate consequence of

(3.3)



Proof of Proposition 3.7.

Using

(3.16)

and that

(0) =

−η

([0

,∞

)), it follows that

(

)

∼ |y|

−2β

/η

([0

,∞

))

y →

0. To show the asymptotic behavior of

∞

start by recalling that, for u, v ∈ R,

∞

u∨v

(s −u)

β−1

(s −v)

β−1

ds =

Γ (β)Γ (1 −2β)

Γ (1 −β)

|u −v|

2β−1

by [16, p. 404]. Having this relation in mind we use Proposition 3.1(ii) and

(3.15)

do the computations

(t) =

Γ (β)

g(u)D

g(v)(s + t −u)

β−1

(s −v)

β−1

dv du ds

Γ (β)

g(u)D

g(v)

∞

(u−t)∨v

(s −(u −t))

β−1

(s −v)

β−1

ds dv du

Γ (1 −2β)

Γ (β)Γ (1 −β)

g(u)D

g(v)|u −v −t|

2β−1

dv du

Γ (1 −2β)

Γ (β)Γ (1 −β)

γ(u)|u −t|

2β−1

du, (6.6)

where

(

) =

(

)

(

)

. Note that

γ ∈ L

since

g ∈ L

by Proposi-

tion 3.1 and, using Plancherel’s theorem,

γ(u) =

−iuy



F [D

g](y)



dy = F [|h(i · )|

−2

](u).

In particular

(

)

(0)

−2

([0

,∞

))

−2

, and hence it follows from

(6.6)

that

we have shown the result if we can argue that

γ(u)|u −t|

2β−1

γ(u)

−1|

1−2β

du →

γ(u) du as t → ∞. (6.7)

It is clear by Lebesgue’s theorem on dominated convergence that

−∞

γ(u)

−1|

1−2β

du →

−∞

γ(u) du as t → ∞.

Moreover, since

(

i ·

)

−2

is continuous at 0 and diﬀerentiable on (

−∞,

0) and (0

,∞

)

with integrable derivatives, it is absolutely continuous on

with a density


As a consequence, γ(u) = F [φ](u)/(iu) and, thus,

∞

t/2

γ(u)

−1|

1−2β

du =

∞

1/2

tγ(tu)

|u −1|

1−2β

du = −i

∞

1/2

F [φ](tu)

u|u −1|

1−2β

du. (6.8)

By the Riemann–Lebesgue lemma and Lebesgue’s theorem on dominated convergence

it follows that the right-hand side of expression in

(6.8)

tends to zero as

tends to

inﬁnity. Finally, integration by parts and the symmetry of γ yields

t/2

γ(u)



1 −

−1|

1−2β



du =

1/2

tγ(tu)



1 −

(1 −u)

1−2β





1−2β

−1



−t/2

−∞

γ(u) du

−

1/2

1 −2β

(1 −u)

2−2β

−tu

−∞

γ(v) dv du,

where both terms on the right-hand side converge to zero as

tends to inﬁnity. Thus,

we have shown (6.7), and this completes the proof. 

Proof of Proposition 3.8.

Observe that it is suﬃcient to argue

[(

−X

)

]

∼ t

t ↓

0. By using the spectral representation

ity

(

) and the isometry prop-

erty of the integral map

· dΛ

: L

) → L

(P), see [15, p. 389], we have that

E[(X

−X

)

]

= t

−2

|1 −e

(y/t) dy

|1 −e

|y|

2β

|(iy)

1−β

−t

1−β

F [η](y/t)|

dy. (6.9)

Consider now a

y ∈ R

satisfying

|y|≥ C

with

|η|

([0

,∞

)))

1/(1−β)

. In this case

|y|

1−β

−|t

1−β

[

](

y/t

)

| ≥

0, and we thus get by the reversed triangle inequality that

|1 −e

|y|

2β

|(iy)

1−β

−t

1−β

F [η](y/t)|

≤ 2

|1 −e

If |y|< C

t, we note that the assumption on the function in (3.5) implies that

B inf

|x|≤C



(ix)

1−β

−F [η](x)



> 0,

which shows that



(iy)

1−β

−t

1−β

F [η](y/t)



≥ t

1−β

≥

1−β

|y|

1−β

This establishes that

|1 −e

|y|

2β



(iy)

1−β

−t

1−β

F [η](y/t)



≤

2(1−β)

|1 −e

Consequently, it follows from

(6.9)

and Lebesgue’s theorem on dominated conver-

gence that

E[(X

−X

)

]

→

|1 −e

dy =

|F [1

(0,1]

](y)|

dy = 1 as t ↓ 0,

which was to be shown. 


Proof of Proposition 3.11.

We start by arguing that the ﬁrst term on the right-hand

side of the formula is well-deﬁned. In order to do so it suﬃces to argue that



t−s

−∞

[0,∞)





−

(s,t−u]



(v + w)



|η|(dv) dw |g|(du)



≤ E[|X

t−s

[0,∞)

−∞





−

(s,t−u]



(v + w)



dw |η|(dv)|g|(du)

(6.10)

is ﬁnite. This is implied by the facts that

Γ (1 −β)

−∞





−

(s,t−u]



(v + w)



≤

u+s−t

(t −s −u + w)

−β

dw +

−β

−(t −s −u + w)

−β

+ (1 + β)

∞

−1−β

(t −s −u) dw

1 −β



2(t −s −u)

1−β

+ 1 −(t −s −u + 1)

1−β



(1 + β)

(t −s −u)

≤

1 −β

(t −s)

1−β

(1 + β)

(t −s)

for

u ∈

,t −s

] and

(

) is a ﬁnite measure (since

g ∈L

by Proposition 3.1). Now

ﬁx an arbitrary z ∈ C with Re(z) > 0. It follows from (3.3) that

L[X1

(s,∞)

](z) = X

L[1

(s,∞

](z) + L[1

(s,∞)

−L

)](z)

+ L

(s,∞)

[0,∞)



−

(s, · ]



(u + v)η(dv) du

(z).

(6.11)

By noting that (D

−

(s,t]

)(u) = 0 when t ≤ s < u we obtain

(s,∞)

∞

[0,∞)



−

(s, · ]



(u + v)η(dv) du

(z)

Γ (1 −β)



∞

[0,∞)

( · −u −v)

−β

η(dv) du



(z)

= L[1

(s,∞)

X](z)L[η](z)z

β−1

Combining this observation with (6.11) we get the relation



z −z

L[η](z)



L[1

(s,∞)

X](z)

= zX

L[1

(s,∞)

](z) + zL[1

(s,∞)

(L −L

)](z)

+ zL

(s,∞)

−∞

[0,∞)



−

(s, · ]



(u + v)η(dv) du

(z),


which implies

L[1

(s,∞)

X](z)

= L[g](z)L[X

(s − · )](z) + zL[g](z)L[1

(s,∞)

(L −L

)](z)

+ zL[g](z)L

(s,∞)

−∞

[0,∞)



−

(s, · ]



(u + v)η(dv) du

(z)

= L[g( · −s)X

](z) + L



g( · −u)dL



(z)

+ L



· −s

−∞

[0,∞)



−

(s, · −u]



(v + w)η(dv) dwg(du)



(z).

This establishes the identity

= g(t −s)X

g(t −u) dL

t−s

−∞

[0,∞)



−

(s,t−u]



(v + w)η(dv) dwg(du)

(6.12)

almost surely for Lebesgue almost all

t > s

. Since both sides of

(6.12)

are continuous

(

), the identity holds for each ﬁxed pair

s < t

almost surely as well. By applying

the conditional mean E[ · | X

, u ≤ s] on both sides of (6.12) we obtain the result. 

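Proof of Corollary 4.1. In this setup it follows that the function $h$ in (3.5) is given by

$$h(z) = z^{1-\beta} + \kappa + \frac{R(z)}{Q(z)},$$

where $Q(z) \neq 0$ whenever $\operatorname{Re}(z) \geq 0$ by the assumption on $A$. This shows that $h$ is non-zero (on $\{z \in \mathbb{C} : \operatorname{Re}(z) \geq 0\}$) if and only if

$$Q(z)\bigl[z^{1-\beta} + \kappa\bigr] + R(z) \neq 0 \quad \text{for all } z \in \mathbb{C} \text{ with } \operatorname{Re}(z) \geq 0. \tag{6.13}$$

Condition (6.13) may equivalently be formulated as $Q(z)[z + \kappa z^{\beta}] + R(z)z^{\beta} \neq 0$ for all $z \in \mathbb{C}\setminus\{0\}$ with $\operatorname{Re}(z) \geq 0$ and $h(0) = \kappa + b^\top A^{-1} e_1 \neq 0$, which by Theorem 3.2 shows that a unique solution to (4.5) exists. It also provides the form of the solution, namely (3.11) with

$$\mathcal{F}[g](y) = \frac{(iy)^{-\beta}}{(iy)^{1-\beta} + \kappa + \frac{R(iy)}{Q(iy)}} = \frac{Q(iy)}{Q(iy)\bigl[iy + \kappa(iy)^{\beta}\bigr] + R(iy)(iy)^{\beta}}, \quad y \in \mathbb{R}.$$

This finishes the proof. □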

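Proof of Proposition 4.2. We will first show that $D^{\beta} f \in L^1$. By using that $\int_0^\infty e^{Au}\,du = -A^{-1}$ we can rewrite $D^{\beta} f$ as

$$D^{\beta} f(t) = \frac{1}{\Gamma(1-\beta)}\, b^\top A \Bigl( \int_0^t e^{Au}\bigl[(t-u)^{-\beta} - t^{-\beta}\bigr]\,du - \int_t^\infty e^{Au}\,du\; t^{-\beta} \Bigr) e_1, \quad t > 0,$$

from which we see that it suffices to argue that (each entry of)

$$t \longmapsto \int_0^t e^{Au}\bigl[(t-u)^{-\beta} - t^{-\beta}\bigr]\,du$$

belongs to $L^1$. Since $u \mapsto e^{Au}$ is continuous and with all entries decaying exponentially fast as $u \to \infty$, this follows from the fact that, for a given $\gamma > 0$,

$$\int_0^\infty \int_0^t e^{-\gamma u}\bigl|(t-u)^{-\beta} - t^{-\beta}\bigr|\,du\,dt \leq \int_0^\infty e^{-\gamma u}\Bigl( \int_u^{u+1} \bigl[(t-u)^{-\beta} + t^{-\beta}\bigr]\,dt + \beta u \int_1^\infty t^{-\beta-1}\,dt \Bigr)\,du < \infty.$$

Here we have used the mean value theorem to establish the inequality

$$\bigl|(t-u)^{-\beta} - t^{-\beta}\bigr| \leq \beta u (t-u)^{-\beta-1}$$

for $0 < u < t$. To show that $D^{\beta} f \in L^2$, note that it is the left-sided Riemann–Liouville fractional derivative of $f$, that is,

$$D^{\beta} f(t) = \frac{1}{\Gamma(1-\beta)} \frac{d}{dt} \int_0^t f(t-u)\, u^{-\beta}\,du, \quad t > 0.$$

Consequently, it follows by [27, Theorem 7.1] that the Fourier transform $\mathcal{F}[D^{\beta} f]$ of $D^{\beta} f$ is given by

$$\mathcal{F}[D^{\beta} f](y) = (iy)^{\beta} \mathcal{F}[f](y) = (iy)^{\beta}\, b^\top (iy\,I_n - A)^{-1} e_1, \quad y \in \mathbb{R},$$

in particular it belongs to $L^2$ (e.g., by Cramer's rule), and thus $D^{\beta} f \in L^2$. By comparing Fourier transforms we establish that $(D^{\beta} g) * f = g * (D^{\beta} f)$, and hence it holds that

$$\int_0^\infty D^{\beta} X_{t-u}\, f(u)\,du = \int_{\mathbb{R}} (D^{\beta} g) * f(t-r)\,dL_r = \int_0^\infty X_{t-u}\, D^{\beta} f(u)\,du$$

using Proposition 3.6 and a stochastic Fubini result. This finishes the proof. □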

7 Supplement to “Stochastic diﬀerential equations with a

fractionally ﬁltered delay: a semimartingale model for

long-range dependent processes”

This supplement provides an exposition of the spectral representation and related

results for continuous-time stationary, measurable, centered and square integrable

processes. The content of the results should be well-known and is mainly provided

for reference.

7.1 Spectral representations of continuous-time stationary processes

In the following we present and prove a few results related to the spectral theory for

continuous-time stationary, measurable, centered and square integrable processes.

Although the results should be well-known, we have not been able to ﬁnd an appro-

priate reference to earlier literature. However, the results presented here rely heavily

on [15, Section 9.4] and [20, Appendix A2.1], in which an extensive treatment of the

spectral theory is given.

Recall that if $S = \{S(t) : t \in \mathbb{R}\}$ is a (complex-valued) process such that

(i) $\mathbb{E}[|S(t)|^2] < \infty$ for all $t \in \mathbb{R}$,

(ii) $\mathbb{E}[|S(t+s) - S(t)|^2] \to 0$ as $s \downarrow 0$ for all $t \in \mathbb{R}$, and

(iii) $\mathbb{E}[(S(v) - S(u))\overline{(S(t) - S(s))}] = 0$ for all $u \leq v \leq s \leq t$,

we may (and do) define integration of $f$ with respect to $S$ in the sense of [15, pp. 388–390] for any $f \in L^2(G)$, where $G$ is the control measure characterized by

$$G((s,t]) = \mathbb{E}[|S(t) - S(s)|^2], \quad s < t.$$

We have the following stochastic Fubini result for this type of integral:

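Proposition 7.1. Let $S = \{S(t) : t \in \mathbb{R}\}$ be a process given as above. Let $\mu$ be a finite Borel measure on $\mathbb{R}$, and let $f : \mathbb{R}^2 \to \mathbb{C}$ be a measurable function in $L^2(\mu \times G)$. Then all the integrals below are well-defined and

$$\int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} f(x,y)\,\mu(dx)\Bigr)\, S(dy) = \int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} f(x,y)\, S(dy)\Bigr)\,\mu(dx) \tag{7.1}$$

almost surely.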

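Suppose that $(X_t)_{t\in\mathbb{R}}$ is a measurable and stationary process with $\mathbb{E}[X_0^2] < \infty$ and $\mathbb{E}[X_0] = 0$, and denote by $\gamma_X$ its autocovariance function. Since $(X_t)_{t\in\mathbb{R}}$ is continuous in $L^2(\mathbb{P})$ (cf. [1, Corollary A.3]), it follows by Bochner's theorem that there exists a finite Borel measure $F_X$ on $\mathbb{R}$ such that

$$\gamma_X(t) = \int_{\mathbb{R}} e^{ity}\, F_X(dy), \quad t \in \mathbb{R}.$$

The measure $F_X$ is referred to as the spectral distribution of $(X_t)_{t\in\mathbb{R}}$.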

Theorem 7.2. Let $(X_t)_{t\in\mathbb{R}}$ be given as above and let $F_X$ be the associated spectral distribution. Then there exists a (complex-valued) process $\Lambda_X = \{\Lambda_X(y) : y \in \mathbb{R}\}$ satisfying (i)–(iii) above with control measure $F_X$, such that

$$X_t = \int_{\mathbb{R}} e^{ity}\, \Lambda_X(dy) \tag{7.2}$$

almost surely for each $t \in \mathbb{R}$. The process $\Lambda_X$ is called the spectral process of $(X_t)_{t\in\mathbb{R}}$ and (7.2) is referred to as its spectral representation.

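Remark 7.3. Let the situation be as in Theorem 7.2 and note that if there exists another process $\tilde{\Lambda}_X = \{\tilde{\Lambda}_X(y) : y \in \mathbb{R}\}$ such that

$$X_t = \int_{\mathbb{R}} e^{ity}\, \tilde{\Lambda}_X(dy), \quad t \in \mathbb{R},$$

then its control measure is necessarily given by $F_X$ and

$$\int_{\mathbb{R}} f(y)\, \Lambda_X(dy) = \int_{\mathbb{R}} f(y)\, \tilde{\Lambda}_X(dy)$$

almost surely for all $f \in L^2(F_X)$.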


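Proof of Proposition 7.1. First, note that (7.1) is trivially true when $f$ is of the form

$$f(x,y) = \sum_{j=1}^{n} \alpha_j 1_{A_j}(x) 1_{B_j}(y) \tag{7.3}$$

for $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ and Borel sets $A_1, B_1, \dots, A_n, B_n \subseteq \mathbb{R}$. Now consider a general $f \in L^2(\mu \times G)$ and choose a sequence of functions $(f_n)_{n\in\mathbb{N}}$ of the form (7.3) such that $f_n \to f$ in $L^2(\mu \times G)$ as $n \to \infty$. Set

$$X_n = \int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} f_n(x,y)\,\mu(dx)\Bigr)\, S(dy), \quad X = \int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} f(x,y)\,\mu(dx)\Bigr)\, S(dy) \quad\text{and}\quad Y = \int_{\mathbb{R}} \Bigl(\int_{\mathbb{R}} f(x,y)\, S(dy)\Bigr)\,\mu(dx).$$

Observe that $X$ and $Y$ are indeed well-defined, since $x \mapsto f(x,y)$ is in $L^1(\mu)$ for $G$-almost all $y$, $y \mapsto f(x,y)$ is in $L^2(G)$ for $\mu$-almost all $x$,

$$\int_{\mathbb{R}} \Bigl|\int_{\mathbb{R}} f(x,y)\,\mu(dx)\Bigr|^2 G(dy) \leq \mu(\mathbb{R}) \int_{\mathbb{R}^2} |f(x,y)|^2\, (\mu \times G)(dx,dy) < \infty$$

and

$$\mathbb{E}\Bigl[\int_{\mathbb{R}} \Bigl|\int_{\mathbb{R}} f(x,y)\, S(dy)\Bigr|\,\mu(dx)\Bigr]^2 \leq \mu(\mathbb{R}) \int_{\mathbb{R}^2} |f(x,y)|^2\, (\mu \times G)(dx,dy) < \infty.$$

Next, we find that

$$\mathbb{E}[|X - X_n|^2] = \int_{\mathbb{R}} \Bigl|\int_{\mathbb{R}} (f(x,y) - f_n(x,y))\,\mu(dx)\Bigr|^2 G(dy) \leq \mu(\mathbb{R}) \int_{\mathbb{R}^2} |f(x,y) - f_n(x,y)|^2\, (\mu \times G)(dx,dy),$$

which tends to zero by the choice of $(f_n)_{n\in\mathbb{N}}$. Since $X_n = \int_{\mathbb{R}} \bigl(\int_{\mathbb{R}} f_n(x,y)\, S(dy)\bigr)\,\mu(dx)$, one shows in a similar way that $X_n \to Y$ in $L^1(\mathbb{P})$, and hence we conclude that $X = Y$ almost surely. □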

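Proof of Theorem 7.2. For any given $t \in \mathbb{R}$ set $f_t(y) = e^{ity}$, $y \in \mathbb{R}$, and let $H_F$ and $H_X$ be the set of all (complex) linear combinations of $\{f_t : t \in \mathbb{R}\}$ and $\{X_t : t \in \mathbb{R}\}$, respectively. By equipping $H_F$ and $H_X$ with the usual inner products on $L^2(F_X)$ and $L^2(\mathbb{P})$, their closures $\overline{H}_F$ and $\overline{H}_X$ are Hilbert spaces. Due to the fact that

$$\langle X_t, X_s\rangle_{L^2(\mathbb{P})} = \mathbb{E}[X_t X_s] = \int_{\mathbb{R}} e^{i(t-s)y}\, F_X(dy) = \langle f_t, f_s\rangle_{L^2(F_X)}, \quad s,t \in \mathbb{R},$$

we can define a linear isometric isomorphism $\mu : \overline{H}_F \to \overline{H}_X$ as the one satisfying

$$\mu\Bigl(\sum_{j=1}^{n} \alpha_j f_{t_j}\Bigr) = \sum_{j=1}^{n} \alpha_j X_{t_j}$$

for any given $n \in \mathbb{N}$, $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ and $t_1 < \cdots < t_n$. Since $1_{(-\infty,y]} \in \overline{H}_F$ for each $y \in \mathbb{R}$ (cf. [32, p. 150]), we can associate a complex-valued process $\Lambda_X = \{\Lambda_X(y) : y \in \mathbb{R}\}$ to $(X_t)_{t\in\mathbb{R}}$ through the relation

$$\Lambda_X(y) = \mu(1_{(-\infty,y]}), \quad y \in \mathbb{R}.$$

It is straightforward to check from the isometry property that $\Lambda_X$ is right-continuous in $L^2(\mathbb{P})$, has orthogonal increments and satisfies

$$\mathbb{E}[|\Lambda_X(y_2) - \Lambda_X(y_1)|^2] = F_X((y_1, y_2]), \quad y_1 < y_2.$$

Consequently, integration with respect to $\Lambda_X$ of any function $f \in L^2(F_X)$ can be defined in the sense of [15, pp. 388–390]. For any $n \in \mathbb{N}$, $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ and $t_0 < t_1 < \cdots < t_n$, we have

$$\int_{\mathbb{R}} \Bigl(\sum_{j=1}^{n} \alpha_j 1_{(t_{j-1},t_j]}(y)\Bigr)\, \Lambda_X(dy) = \sum_{j=1}^{n} \alpha_j\, \mu(1_{(t_{j-1},t_j]}) = \mu\Bigl(\sum_{j=1}^{n} \alpha_j 1_{(t_{j-1},t_j]}\Bigr).$$

Since $f \mapsto \int_{\mathbb{R}} f(y)\, \Lambda_X(dy)$ is a continuous map (from $L^2(F_X)$ into $L^2(\mathbb{P})$), it follows by approximation with simple functions and from the relation above that

$$\int_{\mathbb{R}} f(y)\, \Lambda_X(dy) = \mu(f)$$

almost surely for any $f \in \overline{H}_F$. In particular, it shows that

$$X_t = \mu(f_t) = \int_{\mathbb{R}} e^{ity}\, \Lambda_X(dy), \quad t \in \mathbb{R},$$

which is the spectral representation of $(X_t)_{t\in\mathbb{R}}$. □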

Acknowledgments

The authors thank Andreas Basse-O’Connor and Jan Pedersen for helpful comments.

The research of Richard Davis was supported in part by ARO MURI grant W911NF–

12–1–0385. The research of Mikkel Slot Nielsen and Victor Rohde was supported by

Danish Council for Independent Research grant DFF–4002–00003.

References

[1] Barndorff-Nielsen, O.E. and A. Basse-O’Connor (2011). Quasi Ornstein–Uhlenbeck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[2] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2019). Stochastic delay differential equations and related autoregressive models. Stochastics. Forthcoming. doi: 10.1080/17442508.2019.1635601.

[3] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2018). Multivariate stochastic delay differential equations and CAR representations of CARMA processes. Stochastic Process. Appl. Forthcoming. doi: 10.1016/j.spa.2018.11.011.

[4] Bennedsen, M. (2015). Rough electricity: a new fractal multi-factor model of electricity spot prices. CREATES Research Paper 42.

[5] Bennedsen, M., A. Lunde and M.S. Pakkanen (2016). Decoupling the short- and long-term behavior of stochastic volatility. arXiv: 1610.00332.

[6] Beran, J., Y. Feng, S. Ghosh and R. Kulik (2016). Long-Memory Processes. Springer.

[7] Bichteler, K. (1981). Stochastic integration and $L^p$-theory of semimartingales. Ann. Probab. 9(1), 49–89.

[8] Brockwell, P.J. and R.A. Davis (2006). Time series: theory and methods. Springer Series in Statistics. Reprint of the second (1991) edition. Springer, New York.

[9] Brockwell, P.J. and T. Marquardt (2005). Lévy-driven and fractionally integrated ARMA processes with continuous time parameter. Statist. Sinica 15(2), 477–494.

[10] Delbaen, F. and W. Schachermayer (1994). A general version of the fundamental theorem of asset pricing. Math. Ann. 300(3), 463–520.

[11] Doetsch, G. (1937). Bedingungen für die Darstellbarkeit einer Funktion als Laplace-integral und eine Umkehrformel für die Laplace-Transformation. Math. Z. 42(1), 263–286. doi: 10.1007/BF01160078.

[12] Doukhan, P., G. Oppenheim and M.S. Taqqu, eds. (2003). Theory and applications of long-range dependence. Boston, MA: Birkhäuser Boston Inc.

[13] Dym, H. and H.P. McKean (1976). Gaussian processes, function theory, and the inverse spectral problem. Probability and Mathematical Statistics, Vol. 31. New York: Academic Press [Harcourt Brace Jovanovich Publishers].

[14] Granger, C.W. and R. Joyeux (1980). An introduction to long-memory time series models and fractional differencing. J. Time Series Anal. 1(1), 15–29.

[15] Grimmett, G. and D. Stirzaker (2001). Probability and random processes. Oxford University Press.

[16] Gripenberg, G. and I. Norros (1996). On the prediction of fractional Brownian motion. J. Appl. Probab. 33(2), 400–410.

[17] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[18] Hosking, J.R. (1981). Fractional differencing. Biometrika 68(1), 165–176.

[19] Jusselin, P. and M. Rosenbaum (2018). No-arbitrage implies power-law market impact and rough volatility. arXiv: 1805.07134.

[20] Koopmans, L.H. (1995). The spectral analysis of time series. Academic Press.

[21] Marquardt, T. (2006). Fractional Lévy processes with an application to long memory moving average processes. Bernoulli 12(6), 1099–1126.

[22] Mohammed, S.E.A. and M.K.R. Scheutzow (1990). Lyapunov exponents and stationary solutions for affine stochastic delay equations. Stochastics Stochastics Rep. 29(2), 259–283.

[23] Newbold, P. and C. Agiakloglou (1993). Bias in the sample autocorrelations of fractional noise. Biometrika 80(3), 698–702.

[24] Pipiras, V. and M.S. Taqqu (2003). Fractional calculus and its connections to fractional Brownian motion. Theory and applications of long-range dependence, 165–201.

[25] Pipiras, V. and M.S. Taqqu (2017). Long-range dependence and self-similarity. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.

[26] Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[27] Samko, S.G., A.A. Kilbas, O.I. Marichev, et al. (1993). Fractional integrals and derivatives. Theory and Applications, Gordon and Breach, Yverdon 1993.

[28] Samorodnitsky, G. (2016). Stochastic processes and long range dependence. Vol. 26. Springer.

[29] Samorodnitsky, G. et al. (2007). Long range dependence. Foundations and Trends® in Stochastic Systems 1(3), 163–257.

[30] Samorodnitsky, G. and M.S. Taqqu (1994). Stable Non-Gaussian Random Processes. Stochastic Modeling. Stochastic models with infinite variance. New York: Chapman & Hall.

[31] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Vol. 68. Cambridge Studies in Advanced Mathematics. Translated from the 1990 Japanese original, Revised by the author. Cambridge University Press.

[32] Yaglom, A.M. (1987). Correlation theory of stationary and related random functions. Vol. I. Springer Series in Statistics. Basic results. New York: Springer-Verlag.

Paper F

Limit Theorems for Quadratic Forms and

Related Quantities of Discretely Sampled

Continuous-Time Moving Averages

Mikkel Slot Nielsen and Jan Pedersen

Abstract

The limiting behavior of Toeplitz type quadratic forms of stationary processes has received much attention over the decades, particularly due to its importance in statistical estimation of the spectrum. In the present paper we study such quantities in the case where the stationary process is a discretely sampled continuous-time moving average driven by a Lévy process. We obtain sufficient conditions, in terms of the kernel of the moving average and the coefficients of the quadratic form, ensuring that the centered and adequately normalized version of the quadratic form converges weakly to a Gaussian limit.

MSC: 60F05; 60G10; 60G51; 60H05

Keywords: Limit theorems; Lévy processes; Moving averages; Quadratic forms

1 Introduction

Let $(Y_t)_{t\in\mathbb{Z}}$ be a stationary sequence of random variables with $\mathbb{E}[Y_0] = 0$ and $\mathbb{E}[Y_0^2] < \infty$, and suppose that $(Y_t)_{t\in\mathbb{Z}}$ is characterized by a parameter $\theta_0$ which we, for simplicity, assume to be an element of $\mathbb{R}$. If one wants to infer the true value $\theta_0$ from a sample $Y(n) = [Y_1,\dots,Y_n]^\top$, a typical estimator is obtained as

$$\hat\theta_n = \operatorname{argmin}_{\theta}\, \ell_n(\theta),$$

where $\ell_n(\theta) = \ell_n(\theta; Y(n))$ is a suitable objective function. On an informal level, the usual strategy for showing asymptotic normality of the estimator $\hat\theta_n$ is to use a Taylor series expansion to write

$$\frac{\ell_n'(\theta_0)}{\sqrt{n}} = -\frac{\ell_n''(\theta_n^*)}{n}\,\sqrt{n}\,(\hat\theta_n - \theta_0)$$

and then show that $\ell_n''(\theta_n^*)/n$ converges in probability to a non-zero constant and $\ell_n'(\theta_0)/\sqrt{n}$ converges in distribution to a centered Gaussian random variable. Here $\ell_n'$ and $\ell_n''$ refer to the first and second order derivative of $\ell_n$ with respect to $\theta$, respectively, and $\theta_n^*$ is a point in the interval formed by $\hat\theta_n$ and $\theta_0$. While the convergence of $\ell_n''(\theta_n^*)/n$ usually can be shown by an ergodic theorem under the assumptions of consistency of $\hat\theta_n$ and ergodicity of $(Y_t)_{t\in\mathbb{Z}}$, showing the desired convergence of $\ell_n'(\theta_0)/\sqrt{n}$ may be much more challenging. In particular, if the quantity $\ell_n'(\theta_0)$ corresponds to a rather complicated function of $Y(n)$, one often needs to impose restrictive assumptions on the dependence structure of $(Y_t)_{t\in\mathbb{Z}}$, e.g., rapidly decaying mixing coefficients. In addition to the concern that mixing conditions of this type do not hold in the presence of long memory, they may generally be difficult to verify.

When $\ell_n$ has an explicit form, one can sometimes exploit the particular structure to prove asymptotic normality of $\ell_n'(\theta_0)/\sqrt{n}$. To be concrete, let $\gamma_Y(\,\cdot\,;\theta)$ denote the autocovariance function of $(Y_t)_{t\in\mathbb{Z}}$ and $\Sigma_n(\theta) = [\gamma_Y(j-k;\theta)]_{j,k=1,\dots,n}$ the covariance matrix of $Y(n)$. A very popular choice of $\ell_n$ is the (scaled) negative Gaussian log-likelihood,

$$\ell_n(\theta) = \log\det(\Sigma_n(\theta)) + Y(n)^\top \Sigma_n(\theta)^{-1} Y(n). \tag{1.1}$$

In order to avoid the cumbersome and, in the presence of long memory, unstable computations related to the inversion of $\Sigma_n(\theta)$, one sometimes instead uses Whittle’s approximation of (1.1), which is given by

$$\ell_{n,\mathrm{Whittle}}(\theta) = \frac{n}{2\pi}\int_{-\pi}^{\pi}\log(2\pi f_Y(y;\theta))\,dy + Y(n)^\top A_n(\theta)\, Y(n) = \frac{n}{2\pi}\int_{-\pi}^{\pi}\Big[\log(2\pi f_Y(y;\theta)) + \frac{I_Y(y)}{2\pi f_Y(y;\theta)}\Big]\,dy, \tag{1.2}$$

where $f_Y(\,\cdot\,;\theta)$ is the spectral density of $Y$, $I_Y$ is the periodogram of $Y$ and

$$A_n(\theta) = \Big[\frac{1}{(2\pi)^2}\int_{-\pi}^{\pi}\frac{e^{i(j-k)y}}{f_Y(y;\theta)}\,dy\Big]_{j,k=1,\dots,n}.$$

(For details about the relation between the Gaussian likelihood and Whittle’s approximation, and for some justification for their use, see [4, 16, 22].) An important feature of both (1.1) and (1.2) is that, under suitable assumptions on $\gamma_Y(\,\cdot\,;\theta)$ and $f_Y(\,\cdot\,;\theta)$, the quantities $\ell_n'(\theta_0)/\sqrt{n}$ and $\ell_{n,\mathrm{Whittle}}'(\theta_0)/\sqrt{n}$ are of the form $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$, where

$$Q_n = \sum_{t,s=1}^{n} b(t-s)\,Y_t Y_s \tag{1.3}$$

and $b:\mathbb{Z}\to\mathbb{R}$ is an even function. Consequently, proving asymptotic normality of $\ell_n'(\theta_0)/\sqrt{n}$ and $\ell_{n,\mathrm{Whittle}}'(\theta_0)/\sqrt{n}$ reduces to determining for which processes $(Y_t)_{t\in\mathbb{Z}}$ and functions $b$, $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$ converges in distribution to a centered Gaussian random variable. In the case where $(Y_t)_{t\in\mathbb{Z}}$ is Gaussian and $b(t) = \int_{-\pi}^{\pi} e^{ity}\,\hat b(y)\,dy$, the papers [1, 14] give conditions on $\hat b$ and the spectral density of $(Y_t)_{t\in\mathbb{Z}}$ ensuring that such weak convergence holds. Moreover, Fox and Taqqu [13] proved non-central limit theorems for (an adequately normalized version of) (1.3) in case $Y_t = H(X_t)$ where $H$ is a Hermite polynomial and $(X_t)_{t\in\mathbb{Z}}$ is a normalized Gaussian sequence with a slowly decaying autocovariance function. In particular, they showed that the limit can be both Gaussian and non-Gaussian depending on the decay rate of the autocovariances.

Later, Giraitis and Surgailis [15] left the Gaussian framework and considered instead general linear processes of the form

$$Y_t = \sum_{s\in\mathbb{Z}} \varphi_{t-s}\,\varepsilon_s, \quad t\in\mathbb{Z}, \tag{1.4}$$

where $(\varepsilon_t)_{t\in\mathbb{Z}}$ is an i.i.d. sequence with $\mathbb{E}[\varepsilon_0] = 0$ and $\mathbb{E}[\varepsilon_0^4] < \infty$, and $\sum_{t\in\mathbb{Z}} \varphi_t^2 < \infty$. They provided sufficient conditions (in terms of $b$ and the autocovariance function of $(Y_t)_{t\in\mathbb{Z}}$) ensuring that $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$ tends to a Gaussian limit. Many interesting processes are given by (1.4), the short-memory ARMA processes and the long-memory ARFIMA processes being the main examples, and their properties have been studied extensively. The literature on these processes is overwhelming, and the following references form only a small sample: [7, 11, 16, 18].
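To make the quadratic form (1.3) concrete, the following is a minimal numerical sketch and not part of the paper. The function names, the sample vector and the particular even function `b_lag_one` are assumptions chosen purely for illustration. It builds the symmetric Toeplitz matrix $[b(t-s)]_{t,s}$ and evaluates $Q_n$ as a matrix quadratic form.

```python
import numpy as np

def quadratic_form(y, b):
    """Return Q_n = sum_{t,s=1}^n b(t - s) * y[t] * y[s] for an even function b."""
    y = np.asarray(y, dtype=float)
    n = y.size
    lags = np.arange(n)
    # Since b is even, the coefficient matrix [b(t - s)] is a symmetric Toeplitz matrix
    # determined by b(0), b(1), ..., b(n - 1).
    coeffs = np.array([b(int(h)) for h in lags])
    toeplitz = coeffs[np.abs(lags[:, None] - lags[None, :])]
    return float(y @ toeplitz @ y)

def b_lag_one(h):
    """Hypothetical choice: b(h) = 1/2 for |h| = 1 and 0 otherwise."""
    return 0.5 if abs(h) == 1 else 0.0

# Illustrative data (not from the paper).
y = np.array([1.0, -2.0, 0.5, 3.0])
q_n = quadratic_form(y, b_lag_one)
# With this b, Q_n reduces to the unnormalized lag-one sample autocovariance.
lag_one_sum = float(np.sum(y[:-1] * y[1:]))
```

With this choice of $b$ each pair of neighbouring observations is counted twice with weight $1/2$, so `q_n` agrees with `lag_one_sum`. This shows in a small case how sample autocovariances fit into the form (1.3).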

The continuous-time analogue of (1.4) is the moving average process $(X_t)_{t\in\mathbb{R}}$ given by

$$X_t = \int_{\mathbb{R}} \varphi(t-s)\,dL_s, \quad t\in\mathbb{R}, \tag{1.5}$$

where $(L_t)_{t\in\mathbb{R}}$ is a two-sided Lévy process with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^4] < \infty$, and where $\varphi:\mathbb{R}\to\mathbb{R}$ is a function in $L^2$. Among popular and well-studied continuous-time moving averages are the CARMA processes, particularly the Ornstein–Uhlenbeck process, and solutions to linear stochastic delay differential equations (see [6, 9, 17, 19] for more on these processes). Bai et al. [2] considered a continuous-time version of (1.3), where sums are replaced by integrals and $(Y_t)_{t\in\mathbb{Z}}$ by $(X_t)_{t\in\mathbb{R}}$ defined in (1.5), and they obtained conditions on $b$ and $\varphi$ ensuring both a Gaussian and non-Gaussian limit for (a suitably normalized version of) the quadratic form.

Our main contribution is Theorem 1.1, which gives sufficient conditions on $b$ and $\varphi$ ensuring that $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$ converges in distribution to a centered Gaussian random variable when $Y_t = X_{t\Delta}$, $t\in\mathbb{Z}$, for some fixed $\Delta > 0$. In the formulation we denote by $\kappa_4$ the fourth cumulant of $L_1$ and by $\gamma_X$ the autocovariance function of $(X_t)_{t\in\mathbb{R}}$ (see the formula in (3.3)).

Theorem 1.1. Let $(X_t)_{t\in\mathbb{R}}$ be given by (1.5) and define $Q_n$ as in (1.3) with $Y_t = X_{t\Delta}$ for some $\Delta > 0$. Suppose that one of the following statements holds:

(i) There exist $\alpha,\beta\in[1,2]$ with $2/\alpha + 1/\beta \ge 5/2$, such that $\sum_{t\in\mathbb{Z}} |b(t)|^{\beta} < \infty$ and

$$\Big(t \longmapsto \sum_{s\in\mathbb{Z}} |\varphi(t+s\Delta)|^{\kappa}\Big) \in L^{4/\kappa}([0,\Delta]) \quad \text{for } \kappa = \alpha, 2.$$

(ii) The function $\varphi$ belongs to $L^4$ and there exist $\alpha,\beta > 0$ with $\alpha+\beta < 1/2$, such that

$$\sup_{t\in\mathbb{R}} |t|^{1-\alpha/2}|\varphi(t)| < \infty \quad\text{and}\quad \sup_{t\in\mathbb{Z}} |t|^{1-\beta}|b(t)| < \infty.$$

Then, as $n\to\infty$, $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$ tends to a Gaussian random variable with mean zero and variance

$$\eta^2 = \kappa_4\int_0^{\Delta}\Big(\sum_{s\in\mathbb{Z}} \varphi(t+s\Delta)\sum_{u\in\mathbb{Z}} b(u)\,\varphi(t+(s+u)\Delta)\Big)^2 dt + 2\sum_{s\in\mathbb{Z}}\Big(\sum_{u\in\mathbb{Z}} b(u)\,\gamma_X((s+u)\Delta)\Big)^2.$$

While the statement in (i) is more general than the statement in (ii) of Theorem 1.1, the latter provides an easy-to-check condition in terms of the decay of $\varphi$ and $b$ at infinity. This decay condition is mild enough to apply to many interesting choices of $(X_t)_{t\in\mathbb{R}}$, including some situations where long memory is present (see, e.g., Example 3.11). Theorem 1.1 relies on an approximation of $Q_n$ by a quantity of the type

$$S_n = \sum_{t=1}^{n} X^1_{t\Delta} X^2_{t\Delta}, \tag{1.6}$$

where $(X^1_t)_{t\in\mathbb{R}}$ and $(X^2_t)_{t\in\mathbb{R}}$ are moving averages of the form (1.5), and a limit theorem for $(S_n - \mathbb{E}[S_n])/\sqrt{n}$. This idea is borrowed from [15]. Although we can use the same overall idea, $(X_{t\Delta})_{t\in\mathbb{Z}}$ is generally not of the form (1.4) and, due to the interplay between the continuous-time specification (1.5) and the discrete-time (low frequency) sampling scheme, the spectral density and related quantities become less tractable. The conditions of Theorem 1.1 are similar to the rather general results of [2], which concerned the continuous-time version of (1.3). A reason that we obtain conditions of the same type as [2] is that our proofs, too, rely on (various modifications of) Young’s inequality for convolutions. Since the setup of that paper requires a continuum of observations of $(X_t)_{t\in\mathbb{R}}$, those results cannot be applied in our case.

In addition to its purpose as a tool in the proof of Theorem 1.1, a limit theorem for $(S_n - \mathbb{E}[S_n])/\sqrt{n}$ is of independent interest, e.g., since it is of the same form as the (scaled) sample autocovariance of (1.5) and of $\ell_n'(\theta_0)/\sqrt{n}$ when $\ell_n$ is a suitable least squares objective function (see Examples 3.3 and 3.4 for details). For this reason, we present our limit theorem for $(S_n - \mathbb{E}[S_n])/\sqrt{n}$ here:

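Theorem 1.2. Let $(X^1_t)_{t\in\mathbb{R}}$ and $(X^2_t)_{t\in\mathbb{R}}$ be as in (1.5) with corresponding kernels $\varphi_1,\varphi_2\in L^2\cap L^4$ and define $S_n$ by (1.6). Suppose that one of the following statements holds:

(i) There exist $\alpha_1,\alpha_2\in[1,2]$ with $1/\alpha_1 + 1/\alpha_2 \ge 3/2$, such that

$$\Big(t\longmapsto \sum_{s\in\mathbb{Z}}\big(|\varphi_i(t+s\Delta)|^{\alpha_i} + \varphi_i(t+s\Delta)^2\big)\Big)\in L^2([0,\Delta]) \quad\text{for } i = 1,2.$$

(ii) The functions $\varphi_1$ and $\varphi_2$ belong to $L^4$ and there exist $\alpha_1,\alpha_2\in(1/2,1)$ with $\alpha_1+\alpha_2 > 3/2$, such that

$$\sup_{t\in\mathbb{R}} |t|^{\alpha_i}|\varphi_i(t)| < \infty \quad\text{for } i = 1,2.$$

Then, as $n\to\infty$, $(S_n - \mathbb{E}[S_n])/\sqrt{n}$ tends to a Gaussian random variable with mean zero and variance

$$\eta^2 = \kappa_4\int_0^{\Delta}\Big(\sum_{s\in\mathbb{Z}}\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)\Big)^2 dt + \mathbb{E}[L_1^2]^2\sum_{s\in\mathbb{Z}}\Big(\int_{\mathbb{R}}\varphi_1(t)\varphi_1(t+s\Delta)\,dt\int_{\mathbb{R}}\varphi_2(t)\varphi_2(t+s\Delta)\,dt + \int_{\mathbb{R}}\varphi_1(t)\varphi_2(t+s\Delta)\,dt\int_{\mathbb{R}}\varphi_2(t)\varphi_1(t+s\Delta)\,dt\Big).$$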

As was the case in Theorem 1.1, statement (i) is more general than statement (ii) of Theorem 1.2, but the latter may be convenient as it gives conditions on the decay rate of $\varphi_1$ and $\varphi_2$ at infinity. In relation to Theorem 1.2, it should be mentioned that limit theorems for the sample autocovariances of moving average processes (1.5) have been studied in [5, 10, 25].

The paper is organized as follows: Section 2 recalls the most relevant concepts in relation to Lévy processes and the corresponding integration theory. Section 3 presents Theorems 3.1 and 3.5, which are our most general central limit theorems for $S_n$ and $Q_n$, and from which we will deduce Theorems 1.1 and 1.2 as special cases. Moreover, Section 3 provides examples demonstrating that the imposed conditions on $\varphi$ (or $\varphi_1$ and $\varphi_2$) are satisfied for CARMA processes, solutions to stochastic delay equations and certain fractional (Lévy) noise processes. Finally, Section 4 contains proofs of all the statements of the paper together with a few supporting results.

2 Preliminaries

In this section we introduce some notation that will be used repeatedly and we recall a

few concepts related to Lévy processes and integration of deterministic functions with

respect to them. For a detailed exposition of Lévy processes and the corresponding

integration theory, see [23, 24].

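For a given measurable function $f:\mathbb{R}\to\mathbb{R}$ and $p\ge 1$ we write $f\in L^p$ if $|f|^p$ is integrable with respect to the Lebesgue measure and $f\in L^\infty$ if $f$ is bounded almost everywhere. For a given function $a:\mathbb{Z}\to\mathbb{R}$ (or sequence $(a(t))_{t\in\mathbb{Z}}$) we write $a\in\ell^p$ if $\|a\|_{\ell^p} := \big(\sum_{t\in\mathbb{Z}}|a(t)|^p\big)^{1/p} < \infty$ and $a\in\ell^\infty$ if $\|a\|_{\ell^\infty} := \sup_{t\in\mathbb{Z}}|a(t)| < \infty$.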

A stochastic process $(L_t)_{t\ge 0}$, $L_0 = 0$, is called a one-sided Lévy process if it is càdlàg and has stationary and independent increments. The distribution of $(L_t)_{t\ge 0}$ is characterized by $L_1$ as a consequence of the relation $\log\mathbb{E}[\exp\{iyL_t\}] = t\log\mathbb{E}[\exp\{iyL_1\}]$. By the Lévy–Khintchine representation it holds that

$$\log\mathbb{E}\big[e^{iyL_1}\big] = iy\gamma - \tfrac{1}{2}\rho^2y^2 + \int_{\mathbb{R}}\big(e^{iyx} - 1 - iyx\mathbf{1}_{\{|x|\le 1\}}\big)\,\nu(dx), \quad y\in\mathbb{R},$$

for some $\gamma\in\mathbb{R}$, $\rho^2\ge 0$ and Lévy measure $\nu$, and hence (the distribution of) $(L_t)_{t\ge 0}$ may be summarized as a triplet $(\gamma,\rho^2,\nu)$. The same holds for a (two-sided) Lévy process $(L_t)_{t\in\mathbb{R}}$ which is constructed as $L_t = L^1_t\mathbf{1}_{t\ge 0} - L^2_{(-t)-}\mathbf{1}_{t<0}$, where $(L^1_t)_{t\ge 0}$ and $(L^2_t)_{t\ge 0}$ are one-sided Lévy processes which are independent copies.

Let $(L_t)_{t\in\mathbb{R}}$ be a Lévy process with $\mathbb{E}[|L_1|] < \infty$ and $\mathbb{E}[L_1] = 0$. Then, for a given measurable function $f:\mathbb{R}\to\mathbb{R}$, the integral $\int_{\mathbb{R}} f(t)\,dL_t$ is well-defined (as a limit in probability of integrals of simple functions) and belongs to $L^p(\mathbb{P})$, $p\ge 1$, if

$$\int_{\mathbb{R}^2}\big(|f(t)x|^2\wedge|f(t)x|^p\big)\,\nu(dx)\,dt < \infty. \tag{2.1}$$

In particular, (2.1) is satisfied if $f\in L^2\cap L^p$ and $\int_{|x|>1}|x|^p\,\nu(dx) < \infty$, the latter condition being equivalent to $\mathbb{E}[|L_1|^p] < \infty$. Finally, when (2.1) holds for $p = 2$ we will often make use of the isometry property of the integral map:

$$\mathbb{E}\Big[\Big(\int_{\mathbb{R}} f(t)\,dL_t\Big)^2\Big] = \mathbb{E}[L_1^2]\int_{\mathbb{R}} f(t)^2\,dt.$$
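As a worked illustration of the isometry property, consider the Ornstein–Uhlenbeck type kernel $f(t) = e^{-\lambda t}\mathbf{1}_{[0,\infty)}(t)$ with $\lambda > 0$. The kernel and the symbol $\lambda$ are assumptions made for this example only. Since $f\in L^2$, the second moment of the integral can be computed directly:

```latex
\mathbb{E}\Big[\Big(\int_{\mathbb{R}} e^{-\lambda t}\mathbf{1}_{[0,\infty)}(t)\,dL_t\Big)^2\Big]
  = \mathbb{E}[L_1^2]\int_0^\infty e^{-2\lambda t}\,dt
  = \frac{\mathbb{E}[L_1^2]}{2\lambda}.
```

The right-hand side is the stationary variance of an Ornstein–Uhlenbeck process with mean-reversion rate $\lambda$ driven by a centered, square-integrable Lévy process.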

3 Further results and examples

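As in the introduction, it will be assumed throughout that $(L_t)_{t\in\mathbb{R}}$ is a two-sided Lévy process with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^4] < \infty$. Set $\sigma^2 = \mathbb{E}[L_1^2]$ and $\kappa_4 = \mathbb{E}[L_1^4] - 3\sigma^4$. Moreover, for functions $\varphi,\varphi_1,\varphi_2:\mathbb{R}\to\mathbb{R}$ in $L^2$ define

$$X_t = \int_{\mathbb{R}}\varphi(t-s)\,dL_s \quad\text{and}\quad X^i_t = \int_{\mathbb{R}}\varphi_i(t-s)\,dL_s \tag{3.1}$$

for $t\in\mathbb{R}$ and $i = 1,2$. We will be interested in the quantities

$$S_n = \sum_{t=1}^{n} X^1_{t\Delta}X^2_{t\Delta} \quad\text{and}\quad Q_n = \sum_{t,s=1}^{n} b(t-s)\,X_{t\Delta}X_{s\Delta} \tag{3.2}$$

for a given $\Delta > 0$ and an even function $b:\mathbb{Z}\to\mathbb{R}$. Our main results, Theorems 3.1 and 3.5, provide a central limit theorem for the quantities in (3.2) and are more general than Theorems 1.1 and 1.2 which were presented in Section 1. Before the formulations we define the autocovariance function of $(X_t)_{t\in\mathbb{R}}$,

$$\gamma_X(h) = \mathbb{E}[X_0X_h] = \sigma^2\int_{\mathbb{R}}\varphi(t)\varphi(t+h)\,dt, \quad h\in\mathbb{R}, \tag{3.3}$$

as well as the autocovariance (crosscovariance) functions of $(X^1_t)_{t\in\mathbb{R}}$ and $(X^2_t)_{t\in\mathbb{R}}$,

$$\gamma_{ij}(h) = \mathbb{E}[X^i_0X^j_h] = \sigma^2\int_{\mathbb{R}}\varphi_i(t)\varphi_j(t+h)\,dt, \quad h\in\mathbb{R}, \quad i,j\in\{1,2\}. \tag{3.4}$$

Theorem 3.1. Suppose that the following conditions hold:

(i) $\int_{\mathbb{R}}|\varphi_i(t)\varphi_i(t+\cdot\,\Delta)|\,dt\in\ell^{\alpha_i}$ for $i = 1,2$ and $\alpha_1,\alpha_2\in[1,\infty]$ with $1/\alpha_1 + 1/\alpha_2 = 1$.

(ii) $\int_{\mathbb{R}}|\varphi_1(t)\varphi_2(t+\cdot\,\Delta)|\,dt\in\ell^2$.

(iii) $\big(t\longmapsto\kappa_4\|\varphi_1(t+\cdot\,\Delta)\varphi_2(t+\cdot\,\Delta)\|_{\ell^1}\big)\in L^2([0,\Delta])$.

Then, as $n\to\infty$, $(S_n - \mathbb{E}[S_n])/\sqrt{n}$ tends to a Gaussian random variable with mean zero and variance

$$\eta^2 = \kappa_4\int_0^{\Delta}\Big(\sum_{s\in\mathbb{Z}}\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)\Big)^2 dt + \sum_{s\in\mathbb{Z}}\gamma_{11}(s\Delta)\gamma_{22}(s\Delta) + \sum_{s\in\mathbb{Z}}\gamma_{12}(s\Delta)\gamma_{21}(s\Delta). \tag{3.5}$$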

Remark 3.2. In case $\kappa_4 = 0$, or equivalently $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion, assumption (iii) of Theorem 3.1 is trivially satisfied and the first term in the variance formula (3.5) vanishes.

Loosely speaking, assumptions (i)–(ii) of Theorem 3.1 concern summability of continuous-time convolutions. Hence, by relying on a modification of Young’s convolution inequality, Theorem 1.2 can be shown to be a special case of Theorem 3.1 (see Lemma 4.3 and the following proof of Theorem 1.2 in Section 4). Examples 3.3 and 3.4 are possible applications of Theorem 3.1.

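Example 3.3. Let $n,m\in\mathbb{N}$ with $m < n-1$, define the sample autocovariance of $(X_t)_{t\in\mathbb{R}}$ based on $X_\Delta, X_{2\Delta},\dots,X_{n\Delta}$ up to lag $m$ as

$$\hat\gamma_n(j) = n^{-1}\sum_{t=1}^{n-j} X_{t\Delta}X_{(t+j)\Delta}, \quad j = 1,\dots,m, \tag{3.6}$$

and set $\hat{\boldsymbol\gamma}_n = [\hat\gamma_n(1),\dots,\hat\gamma_n(m)]^\top$. Moreover, let $\boldsymbol\varphi(t) = [\varphi(t+\Delta),\dots,\varphi(t+m\Delta)]^\top$ and $\boldsymbol\gamma_s = [\gamma_X((s+1)\Delta),\dots,\gamma_X((s+m)\Delta)]^\top$ using the notation as in (3.1) and (3.3). Then, for a given $\alpha = [\alpha_1,\dots,\alpha_m]^\top\in\mathbb{R}^m$, it holds that

$$\alpha^\top\hat{\boldsymbol\gamma}_n - \alpha^\top\boldsymbol\gamma_0 = n^{-1}\sum_{t=1}^{n}\big(X^1_{t\Delta}X^2_{t\Delta} - \mathbb{E}[X^1_0X^2_0]\big) + O_p(n^{-1}), \tag{3.7}$$

where $(X^1_t)_{t\in\mathbb{R}}$ and $(X^2_t)_{t\in\mathbb{R}}$ are given by (3.1) with $\varphi_1 = \varphi$ and $\varphi_2(t) = \alpha^\top\boldsymbol\varphi(t)$. Here $O_p(n^{-1})$ in (3.7) means that the equality holds up to a term $\varepsilon_n$ which is stochastically bounded by $n^{-1}$ (that is, $(n\varepsilon_n)_{n\in\mathbb{N}}$ is tight). Then if

$$\int_{\mathbb{R}}|\varphi(t)\varphi(t+\cdot\,\Delta)|\,dt\in\ell^2 \quad\text{and}\quad \big(t\longmapsto\|\varphi(t+\cdot\,\Delta)\|_{\ell^2}^2\big)\in L^2([0,\Delta]), \tag{3.8}$$

assumptions (i)–(iii) of Theorem 3.1 hold and we deduce that $\sqrt{n}\,\alpha^\top(\hat{\boldsymbol\gamma}_n - \boldsymbol\gamma_0)$ converges in distribution to a centered Gaussian random variable with variance $\alpha^\top\Sigma\alpha$, where

$$\Sigma = \kappa_4\int_0^{\Delta}K(t)K(t)^\top dt + \sum_{s\in\mathbb{Z}}(\boldsymbol\gamma_s + \boldsymbol\gamma_{-s})\boldsymbol\gamma_s^\top, \qquad K(t) := \sum_{s\in\mathbb{Z}}\varphi(t+s\Delta)\,\boldsymbol\varphi(t+s\Delta).$$

By the Cramér–Wold theorem we conclude that $\sqrt{n}(\hat{\boldsymbol\gamma}_n - \boldsymbol\gamma_0)$ converges in distribution to a centered Gaussian vector with covariance matrix $\Sigma$. This type of central limit theorem for the sample autocovariances of continuous-time moving averages was established in [10] under the same assumptions on $\varphi$ as imposed above.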
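The estimator in (3.6) is easy to compute in practice. The sketch below is illustrative only: it assumes a Gaussian Ornstein–Uhlenbeck process, with kernel $\varphi(t) = e^{-\lambda t}\mathbf{1}_{[0,\infty)}(t)$ and $\sigma^2 = 1$, as a hypothetical test case, simulates it exactly on the grid $\Delta, 2\Delta,\dots,n\Delta$ as an AR(1) recursion, and compares $\hat\gamma_n(j)$ with $\gamma_X(j\Delta) = e^{-\lambda j\Delta}/(2\lambda)$. The function names and parameter values are assumptions, not taken from the paper.

```python
import numpy as np

def sample_autocovariance(x, max_lag):
    """Return [gamma_hat(1), ..., gamma_hat(max_lag)] as in (3.6), with the 1/n scaling."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return np.array([np.dot(x[: n - j], x[j:]) / n for j in range(1, max_lag + 1)])

def simulate_ou_on_grid(n, delta, lam, rng):
    """Sample a stationary Gaussian OU process (sigma^2 = 1) at delta, 2*delta, ..., n*delta."""
    rho = np.exp(-lam * delta)                 # AR(1) coefficient of the sampled process
    stationary_sd = np.sqrt(1.0 / (2.0 * lam)) # square root of gamma_X(0)
    innovations = stationary_sd * np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    x = np.empty(n)
    x[0] = stationary_sd * rng.standard_normal()
    for t in range(1, n):
        x[t] = rho * x[t - 1] + innovations[t]
    return x

# Hypothetical parameter values chosen for illustration.
rng = np.random.default_rng(0)
delta, lam, n = 0.5, 1.0, 50_000
x = simulate_ou_on_grid(n, delta, lam, rng)
gamma_hat = sample_autocovariance(x, max_lag=3)
gamma_true = np.exp(-lam * delta * np.arange(1, 4)) / (2.0 * lam)
```

For a sample of this size the entries of `gamma_hat` should lie close to `gamma_true`, in line with fluctuations of order $n^{-1/2}$ as described by the central limit theorem above.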

Example 3.4. Motivated by the discussion in the introduction, this example will illustrate how Theorem 3.1 can be applied to show asymptotic normality of the (adequately normalized) derivative of a least squares objective function. Fix $k\in\mathbb{N}$, let $v:\mathbb{R}\to\mathbb{R}^k$ be a differentiable function with derivative $v'$ and consider

$$\ell_n(\theta) = \sum_{t=k+1}^{n}\big(X_{t\Delta} - v(\theta)^\top X(t)\big)^2, \quad \theta\in\mathbb{R}, \tag{3.9}$$

where $X(t) = [X_{(t-1)\Delta},\dots,X_{(t-k)\Delta}]^\top$. In this case

$$\ell_n'(\theta) = -2\sum_{t=k+1}^{n}\big(X_{t\Delta} - v(\theta)^\top X(t)\big)\,v'(\theta)^\top X(t), \quad \theta\in\mathbb{R},$$

and hence it is of the same form as (3.2) with $\varphi_1(t) = [1, -v(\theta)^\top]\,\boldsymbol\varphi(t)$ and $\varphi_2(t) = [0, v'(\theta)^\top]\,\boldsymbol\varphi(t)$, where $\boldsymbol\varphi(t) = [\varphi(t),\varphi(t-\Delta),\dots,\varphi(t-k\Delta)]^\top$. Suppose that $v(\theta_0)$ coincides with the vector of coefficients of the $L^2(\mathbb{P})$ projection of $X_{(k+1)\Delta}$ onto the linear span of $X_{k\Delta},\dots,X_{\Delta}$ for some $\theta_0\in\mathbb{R}$. In this case $\mathbb{E}[\ell_n'(\theta_0)] = 0$, and if (3.8) holds it thus follows from Theorem 3.1 that $\ell_n'(\theta_0)/\sqrt{n}$ converges in distribution to a centered Gaussian random variable.

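Theorem 3.5 is our most general result concerning the limiting behavior of $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$ as $n\to\infty$. For notational convenience we will, for given $a:\mathbb{Z}\to\mathbb{R}$ and $f:\mathbb{R}\to\mathbb{R}$, set

$$(a\star f)(t) := \sum_{s\in\mathbb{Z}} a(s)f(t-s\Delta) \tag{3.10}$$

for any $t\in\mathbb{R}$ such that $\sum_{s\in\mathbb{Z}}|a(s)f(t-s\Delta)| < \infty$. If $a$ and $f$ are non-negative, the definition in (3.10) is used for all $t\in\mathbb{R}$. Moreover, we write $|a|(t) = |a(t)|$ and $|f|(t) = |f(t)|$.

Theorem 3.5. Suppose that the following statements hold:

(i) There exist $\alpha,\beta\in[1,\infty]$ with $1/\alpha + 1/\beta = 1$, such that $\int_{\mathbb{R}}|\varphi(t)\varphi(t+\cdot\,\Delta)|\,dt\in\ell^{\alpha}$ and $\int_{\mathbb{R}}(|b|\star|\varphi|)(t)\,(|b|\star|\varphi|)(t+\cdot\,\Delta)\,dt\in\ell^{\beta}$.

(ii) $\int_{\mathbb{R}}|\varphi(t)|\,(|b|\star|\varphi|)(t+\cdot\,\Delta)\,dt\in\ell^2$.

(iii) $\big(t\longmapsto\kappa_4\|\varphi(t+\cdot\,\Delta)\,(|b|\star|\varphi|)(t+\cdot\,\Delta)\|_{\ell^1}\big)\in L^2([0,\Delta])$.

Then, as $n\to\infty$, $(Q_n - \mathbb{E}[Q_n])/\sqrt{n}$ converges in distribution to a Gaussian random variable with mean zero and variance

$$\eta^2 = \kappa_4\int_0^{\Delta}\Big(\sum_{s\in\mathbb{Z}}\varphi(t+s\Delta)\,(b\star\varphi)(t+s\Delta)\Big)^2 dt + 2\,\|(b\star\gamma_X)(\cdot\,\Delta)\|_{\ell^2}^2. \tag{3.11}$$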

Remark 3.6. The idea in the proof of Theorem 3.5 is to approximate $Q_n$ with $S_n$ where $\varphi_1 = \varphi$ and $\varphi_2 = b\star\varphi$. The conditions imposed in Theorem 3.5 correspond to assuming that $\varphi$ and $|b|\star|\varphi|$ satisfy (i)–(iii) of Theorem 3.1. In particular, these conditions ensure that $S_n$ is well-defined and that Theorem 3.1 applies to this choice of $\varphi_1$ and $\varphi_2$. The only lacking part in order to deduce Theorem 3.5 from Theorem 3.1 is to show that $S_n$ is in fact a proper approximation of $Q_n$ in the sense that $\operatorname{Var}(Q_n - S_n)/n\to 0$ as $n\to\infty$, but this is verified in Section 4 where the proofs of the stated results can be found.

Remark 3.7. Note that for any $s\in\mathbb{Z}$ with $b(s)\ne 0$, it holds that

$$|\varphi(t)| \le |b(s)|^{-1}(|b|\star|\varphi|)(t+s\Delta) \quad\text{for all } t\in\mathbb{R}. \tag{3.12}$$

This fact ensures that assumptions (i)–(ii) of Theorem 3.5 hold if there exists $\beta\in[1,2]$ such that

$$\int_{\mathbb{R}}(|b|\star|\varphi|)(t)\,(|b|\star|\varphi|)(t+\cdot\,\Delta)\,dt\in\ell^{\beta}. \tag{3.13}$$

(Here we exclude the trivial case $b\equiv 0$.) Indeed, if (3.13) is satisfied we can choose $\alpha\ge\beta$ such that $1/\alpha + 1/\beta = 1$ and then assumptions (i)–(ii) are met due to the inequality in (3.12) and the fact that $\ell^{\beta}\subseteq\ell^{\alpha}\cap\ell^2$.
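The operation in (3.10) and the inequality (3.12) can be checked numerically for finitely supported coefficients. The following sketch is only an illustration: the kernel `phi`, the coefficients `b`, the grid, and the value of `delta` are hypothetical choices, and the dictionary representation of $b$ is an implementation convenience rather than anything prescribed by the paper.

```python
import math

def star(a, f, t, delta):
    """Return (a * f)(t) = sum_s a[s] * f(t - s * delta) for finitely supported a.

    `a` is a dict mapping integers s to the coefficients a(s).
    """
    return sum(coef * f(t - s * delta) for s, coef in a.items())

def phi(t):
    """Hypothetical kernel: phi(t) = exp(-t) for t >= 0 and 0 otherwise."""
    return math.exp(-t) if t >= 0 else 0.0

def abs_phi(t):
    """Pointwise absolute value |phi|(t) = |phi(t)|."""
    return abs(phi(t))

delta = 0.5
b = {-1: 0.25, 0: 1.0, 1: 0.25}          # hypothetical even coefficients
abs_b = {s: abs(v) for s, v in b.items()}

grid = [-2.0 + 0.05 * i for i in range(141)]
# Inequality (3.12): |phi(t)| <= |b(s)|^{-1} (|b| * |phi|)(t + s * delta) whenever b(s) != 0.
inequality_holds = all(
    abs_phi(t) <= star(abs_b, abs_phi, t + s * delta, delta) / abs(b[s]) + 1e-12
    for t in grid
    for s in b
)
```

The inequality holds because the term with index $s$ in the sum defining $(|b|\star|\varphi|)(t+s\Delta)$ equals $|b(s)||\varphi(t)|$ and all remaining terms are non-negative.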

Remark 3.8. We will now briefly comment on the conditions of Theorems 1.1 and 1.2, particularly on sufficient conditions for applying Theorems 3.1 and 3.5. We will restrict our attention to assumptions of the type

$$\big(t\longmapsto\|\psi(t+\cdot\,\Delta)\|_{\ell^{\kappa}}^{\kappa}\big)\in L^2([0,\Delta]), \tag{3.14}$$

where $\psi:\mathbb{R}\to\mathbb{R}$ is a measurable function and $\kappa\ge 1$. First of all, note that the weaker condition $\big(t\mapsto\|\psi(t+\cdot\,\Delta)\|_{\ell^{\kappa}}^{\kappa}\big)\in L^1([0,\Delta])$ is satisfied if and only if $\psi\in L^{\kappa}$, and condition (3.14) implies $\psi\in L^{2\kappa}$. In particular, a necessary condition for (3.14) to hold is that $\psi\in L^{\kappa}\cap L^{2\kappa}$. On the other hand, one may decompose $\|\psi(t+\cdot\,\Delta)\|_{\ell^{\kappa}}^{\kappa}$ as

$$\|\psi(t+\cdot\,\Delta)\|_{\ell^{\kappa}}^{\kappa} = \sum_{s=-M}^{M}|\psi(t+s\Delta)|^{\kappa} + \sum_{s=M+1}^{\infty}\big(|\psi(t+s\Delta)|^{\kappa} + |\psi(t-s\Delta)|^{\kappa}\big) \tag{3.15}$$

for any $M\in\mathbb{N}$. The first term on the right-hand side of (3.15) belongs to $L^2([0,\Delta])$ (viewed as a function of $t$) if $\psi\in L^{2\kappa}$. If in addition $\psi\in L^{\kappa}$, the second term on the right-hand side tends to zero as $M\to\infty$ for (Lebesgue almost) all $t\in[0,\Delta]$. If this could be assumed to hold uniformly across all $t$, that is, if the second term belongs to $L^{\infty}([0,\Delta])$ for a sufficiently large $M$, then (3.14) would be satisfied. Therefore, loosely speaking, the difference between $L^{\kappa}\cap L^{2\kappa}$ and the space of functions satisfying (3.14) consists of functions $\psi$ where the second term in (3.15) tends to zero pointwise, but not uniformly, in $t$ as $M\to\infty$. Ultimately, this is a condition on the behavior of the tail of the function between grid points. For instance, if there exists a sequence $(\psi_s)_{s\in\mathbb{Z}}\in\ell^{\kappa}$ such that $\sup_{t\in[0,\Delta]}|\psi(t\pm s\Delta)|\le\psi_s$ for all sufficiently large $s$, then (3.14) holds. An assumption such as (3.14) seems to be necessary and is the cost of considering a continuous-time process only on a discrete-time grid. In [10], where they prove a central limit theorem for the sample autocovariance of a continuous-time moving average in a low frequency setting, a similar condition is imposed.

In the following examples we focus on concrete specifications of moving average processes for which the behavior of the corresponding kernel is known, so that Theorems 1.1 and 1.2 may be applicable.

Example 3.9. Fix $p\in\mathbb{N}$ and let $P(z) = z^p + a_1z^{p-1} + \dots + a_p$ and $Q(z) = b_0 + b_1z + \dots + b_{p-1}z^{p-1}$, $z\in\mathbb{C}$, be two real polynomials where all the zeroes of $P$ are contained in $\{z\in\mathbb{C} : \operatorname{Re}(z) < 0\}$. Moreover, let $q\in\mathbb{N}_0$ with $q < p$ and suppose that $b_q = 1$ and $b_k = 0$ for $q < k\le p-1$. Finally, define

$$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_p & -a_{p-1} & -a_{p-2} & \cdots & -a_1 \end{bmatrix}, \quad b = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{p-2} \\ b_{p-1} \end{bmatrix} \quad\text{and}\quad e_p = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}.$$

Then the corresponding (causal) CARMA($p$,$q$) process $(X_t)_{t\in\mathbb{R}}$ is given by

$$X_t = \int_{-\infty}^{t} b^\top e^{A(t-u)}e_p\,dL_u, \quad t\in\mathbb{R}. \tag{3.16}$$

(See [21, Remark 3.2].) The definition in (3.16) is based on a state-space representation of the more intuitive formal differential equation

$$P(D)X_t = Q(D)DL_t, \quad t\in\mathbb{R}, \tag{3.17}$$

where $D$ denotes differentiation with respect to time. Equation (3.17) should be compared to the corresponding representation of an ARMA process in terms of the backward-shift operator. Since it can be shown that the eigenvalues of $A$ correspond to the roots of $P$, the kernel $\varphi: t\mapsto\mathbf{1}_{[0,\infty)}(t)\,b^\top e^{At}e_p$ is exponentially decaying at infinity. Combining this with the (absolute) continuity of $\varphi$ on $[0,\infty)$ ensures that the kernel belongs to $L^{\infty}$ as well. In particular, this shows that Theorem 1.1(i) holds as long as $b\in\ell^2$. For more on CARMA processes, we refer to [6, 8, 9].
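The CARMA kernel $t\mapsto\mathbf{1}_{[0,\infty)}(t)\,b^\top e^{At}e_p$ can be evaluated numerically from the companion matrix. The sketch below is an illustration under stated assumptions: it assumes that the roots of $P$ are distinct, so that $e^{At}$ can be formed from an eigendecomposition, and the function name and the CARMA(2,1) coefficients are hypothetical. For $P(z) = z^2 + 3z + 2$ and $Q(z) = 3 + z$ one can check by solving the linear system by hand that the kernel equals $2e^{-t} - e^{-2t}$ for $t\ge 0$.

```python
import numpy as np

def carma_kernel(t, a, b):
    """Evaluate phi(t) = 1_{[0, inf)}(t) * b^T exp(A t) e_p for a CARMA(p, q) model.

    `a` = [a_1, ..., a_p] holds the coefficients of P(z) = z^p + a_1 z^{p-1} + ... + a_p and
    `b` = [b_0, ..., b_{p-1}] those of Q(z). Assumes the roots of P are distinct, so that
    A is diagonalizable and exp(A t) can be formed from an eigendecomposition.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p = a.size
    if t < 0:
        return 0.0
    # Companion matrix: ones on the superdiagonal, last row (-a_p, ..., -a_1).
    A = np.diag(np.ones(p - 1), k=1)
    A[-1, :] = -a[::-1]
    e_p = np.zeros(p)
    e_p[-1] = 1.0
    eigvals, eigvecs = np.linalg.eig(A)
    # exp(A t) e_p = V diag(exp(lambda_i t)) V^{-1} e_p
    weights = np.linalg.solve(eigvecs, e_p.astype(complex))
    state = eigvecs @ (np.exp(eigvals * t) * weights)
    return float(np.real(b @ state))

# Hypothetical CARMA(2,1) example: P(z) = z^2 + 3z + 2 (roots -1, -2), Q(z) = 3 + z.
a_coeffs = [3.0, 2.0]
b_coeffs = [3.0, 1.0]
value = carma_kernel(0.7, a_coeffs, b_coeffs)
closed_form = 2.0 * np.exp(-0.7) - np.exp(-1.4)
```

The eigenvalues of the companion matrix coincide with the roots of $P$, which is what produces the exponential decay of the kernel noted in the example. For repeated roots the eigendecomposition is not reliable and a general matrix exponential routine would be the safer choice.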

Example 3.10. Let $\eta$ be a finite signed measure on $[0,\infty)$ and suppose that

$$z - \int_{[0,\infty)} e^{-zt}\,\eta(dt) \ne 0$$

for every $z\in\mathbb{C}$ with $\operatorname{Re}(z)\ge 0$. Then it follows from [3, Theorem 3.4] that the unique stationary solution $(X_t)_{t\in\mathbb{R}}$ to the corresponding stochastic delay differential equation

$$dX_t = \int_{[0,\infty)} X_{t-s}\,\eta(ds)\,dt + dL_t, \quad t\in\mathbb{R},$$

takes the form $X_t = \int_{-\infty}^{t}\varphi(t-s)\,dL_s$, where $\varphi:\mathbb{R}\to\mathbb{R}$ is characterized as the unique function satisfying $\varphi(t) = 0$ for $t < 0$ and

$$\varphi(t) = 1 + \int_0^t\int_{[0,\infty)}\varphi(s-u)\,\eta(du)\,ds, \quad t\ge 0.$$

Consequently, it follows from the integration by parts formula that

$$\sup_{t\ge 0} t^p|\varphi(t)| \le p\int_0^{\infty} t^{p-1}|\varphi(t)|\,dt + 2^p\,|\eta|([0,\infty))\int_0^{\infty} t^p|\varphi(t)|\,dt + 2^p\int_{[0,\infty)} t^p\,|\eta|(dt)\int_0^{\infty}|\varphi(t)|\,dt \tag{3.18}$$

for a given $p\ge 1$. Here $|\eta|$ is the variation measure of $\eta$. If one assumes that $|\eta|$ has moments up to order $p+1$, that is,

$$\int_{[0,\infty)} t^{p+1}\,|\eta|(dt) < \infty,$$

it follows by [3, Lemma 3.2] that the measure $|\varphi(t)|\,dt$ is finite and has moments up to order $p$. Consequently, under this assumption we have that $\sup_{t\ge 0} t^p|\varphi(t)| < \infty$ by (3.18) and Theorem 1.1(ii) holds as long as $\sup_{t\in\mathbb{Z}}|t|^{1/2+\delta}|b(t)| < \infty$ for some $\delta > 0$.

Example 3.11. Suppose that $(X_t)_{t\in\mathbb{R}}$ is given by (3.1) with

$$\varphi(t) = \frac{1}{\Gamma(1+d)}\big[t_+^d - (t-1)_+^d\big], \quad t\in\mathbb{R},$$

and $d\in(0,1/4)$. (Here $\Gamma(1+d) = \int_0^{\infty}u^de^{-u}\,du$ is the Gamma function at $1+d$.) In other words, we assume that $(X_t)_{t\in\mathbb{R}}$ is a fractional Lévy noise with parameter $d$. Recall that $\gamma_X(h)\sim ch^{2d-1}$ as $h\to\infty$ for a suitable constant $c > 0$ (see, e.g., [20, Theorem 6.3]), and hence we are in a setup where

$$\sum_{s\in\mathbb{Z}}|\gamma_X(s\Delta)| = \infty, \quad\text{but}\quad \sum_{s\in\mathbb{Z}}\gamma_X(s\Delta)^2 < \infty.$$

Moreover, it is shown in [10, Theorem A.1] that $(X_{t\Delta})_{t\in\mathbb{Z}}$ is not strongly mixing. However, Theorems 1.1 and 1.2 may still be applied in this setup, since $\varphi$ is vanishing on $(-\infty,0)$, continuous on $\mathbb{R}$, and $\varphi(t)\sim d\,t^{d-1}/\Gamma(1+d)$ as $t\to\infty$.
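The tail behavior $\varphi(t)\sim d\,t^{d-1}/\Gamma(1+d)$ can be inspected numerically. The following sketch is illustrative only: the function names and the value $d = 0.2$ are assumptions. It evaluates the fractional noise kernel and its leading-order tail and forms their ratio along an increasing sequence of time points.

```python
import math

def positive_part(x):
    """Return x_+ = max(x, 0)."""
    return max(x, 0.0)

def fractional_noise_kernel(t, d):
    """phi(t) = (t_+^d - (t - 1)_+^d) / Gamma(1 + d)."""
    return (positive_part(t) ** d - positive_part(t - 1.0) ** d) / math.gamma(1.0 + d)

def tail_approximation(t, d):
    """Leading-order tail d * t^(d - 1) / Gamma(1 + d), intended for large t."""
    return d * t ** (d - 1.0) / math.gamma(1.0 + d)

d = 0.2  # hypothetical value in (0, 1/4)
time_points = (1e1, 1e2, 1e3, 1e4)
ratios = [fractional_noise_kernel(t, d) / tail_approximation(t, d) for t in time_points]
```

The ratios decrease towards one as $t$ grows. In terms of Theorem 1.1(ii), a tail of order $t^{d-1}$ gives $\sup_t|t|^{1-\alpha/2}|\varphi(t)| < \infty$ for $\alpha\ge 2d$, which is compatible with $\alpha+\beta < 1/2$ for some $\beta > 0$ precisely when $d < 1/4$.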

4 Proofs

The ﬁrst observation will be used in the proof of Theorem 3.1.

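Lemma 4.1. Let $g_1,\dots,g_4:\mathbb{R}\to\mathbb{R}$ be functions in $L^2\cap L^4$. Then it holds that

$$\mathbb{E}\Big[\prod_{j=1}^{4}\int_{\mathbb{R}} g_j(u)\,dL_u\Big] = \kappa_4\int_{\mathbb{R}}\prod_{j=1}^{4} g_j(u)\,du + \sigma^4\Big(\int_{\mathbb{R}} g_1(u)g_2(u)\,du\int_{\mathbb{R}} g_3(u)g_4(u)\,du + \int_{\mathbb{R}} g_1(u)g_3(u)\,du\int_{\mathbb{R}} g_2(u)g_4(u)\,du + \int_{\mathbb{R}} g_1(u)g_4(u)\,du\int_{\mathbb{R}} g_2(u)g_3(u)\,du\Big). \tag{4.1}$$

Proof. Set $Y_j = \int_{\mathbb{R}} g_j(u)\,dL_u$. Then, using [16, Proposition 4.2.2], we obtain that

$$\mathbb{E}[Y_1Y_2Y_3Y_4] = \operatorname{Cum}(Y_1,Y_2,Y_3,Y_4) + \mathbb{E}[Y_1Y_2]\mathbb{E}[Y_3Y_4] + \mathbb{E}[Y_1Y_3]\mathbb{E}[Y_2Y_4] + \mathbb{E}[Y_1Y_4]\mathbb{E}[Y_2Y_3], \tag{4.2}$$

where

$$\operatorname{Cum}(Y_1,Y_2,Y_3,Y_4) = \frac{\partial^4}{\partial u_1\cdots\partial u_4}\log\mathbb{E}\big[e^{i(u_1Y_1+\dots+u_4Y_4)}\big]\Big|_{u_1=\dots=u_4=0}.$$

Set $\psi_L(u) = \log\mathbb{E}[e^{iuL_1}]$ for $u\in\mathbb{R}$. It follows from the Lévy–Khintchine representation that we can find a constant $C > 0$ such that $|\psi_L^{(1)}(u)|\le C|u|$ and $|\psi_L^{(m)}(u)|\le C$ for $m = 2,3,4$. (Here $\psi_L^{(m)}$ is the $m$th derivative of $\psi_L$.) Using this together with the representation

$$\log\mathbb{E}\big[e^{i(u_1Y_1+\dots+u_4Y_4)}\big] = \int_{\mathbb{R}}\psi_L\big(u_1g_1(t)+\dots+u_4g_4(t)\big)\,dt,$$

see [23], we can interchange differentiation and integration to obtain

$$\operatorname{Cum}(Y_1,Y_2,Y_3,Y_4) = \int_{\mathbb{R}}\psi_L^{(4)}\big(u_1g_1(t)+\dots+u_4g_4(t)\big)\prod_{j=1}^{4}g_j(t)\,dt\,\Big|_{u_1=\dots=u_4=0} = \kappa_4\int_{\mathbb{R}}\prod_{j=1}^{4}g_j(t)\,dt.$$

By combining this observation with the fact that $\mathbb{E}[Y_jY_k] = \sigma^2\int_{\mathbb{R}} g_j(u)g_k(u)\,du$ (using the isometry property), the result is an immediate consequence of (4.2). $\square$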

Remark 4.2. In case $g_1 = g_2$ and $g_3 = g_4$, Lemma 4.1 collapses to [10, Lemma 3.2], and if $\kappa_4 = 0$ then $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion and the result is a special case of Isserlis' theorem.
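The structure of the identity (4.1) can be illustrated by a discrete analogue, in which the stochastic integrals are replaced by finite weighted sums $\sum_k g_j(k)\varepsilon_k$ of i.i.d. mean-zero innovations, $\sigma^2$ is the innovation variance and $\kappa_4$ the fourth cumulant. The sketch below is not taken from the paper: the two-point innovation law, the weight vectors `G` and all function names are hypothetical. It compares the exact fourth moment (obtained by enumerating all outcomes) with the right-hand side of the formula.

```python
from fractions import Fraction
from itertools import product

# Hypothetical two-point innovation law with mean zero.
VALUES = (Fraction(-2), Fraction(1))
PROBS = (Fraction(1, 3), Fraction(2, 3))

def moment(k):
    """k-th moment of a single innovation."""
    return sum(p * v ** k for v, p in zip(VALUES, PROBS))

SIGMA2 = moment(2)                    # variance (the mean is zero)
KAPPA4 = moment(4) - 3 * SIGMA2 ** 2  # fourth cumulant of a mean-zero variable

def inner(a, b):
    return sum(x * y for x, y in zip(a, b))

def fourth_moment_exact(g):
    """E[Y_1 Y_2 Y_3 Y_4] with Y_j = sum_k g[j][k] * eps_k, by full enumeration."""
    n = len(g[0])
    total = Fraction(0)
    for idx in product(range(len(VALUES)), repeat=n):
        prob = Fraction(1)
        for i in idx:
            prob *= PROBS[i]
        eps = [VALUES[i] for i in idx]
        term = prob
        for gj in g:
            term *= inner(gj, eps)
        total += term
    return total

def fourth_moment_formula(g):
    """Discrete counterpart of the right-hand side of (4.1)."""
    g1, g2, g3, g4 = g
    diag = sum(a * b * c * d for a, b, c, d in zip(g1, g2, g3, g4))
    pairs = (inner(g1, g2) * inner(g3, g4)
             + inner(g1, g3) * inner(g2, g4)
             + inner(g1, g4) * inner(g2, g3))
    return KAPPA4 * diag + SIGMA2 ** 2 * pairs

G = [[1, 2, -1], [0, 1, 3], [2, -1, 1], [1, 1, 1]]  # hypothetical weights
print(fourth_moment_exact(G), fourth_moment_formula(G))
```

Exact rational arithmetic is used so that the two sides can be compared without rounding error. Setting `KAPPA4` to zero would leave only the three pairing terms, which is the pattern of Isserlis' theorem mentioned in Remark 4.2.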

We are now ready to prove Theorem 3.1.

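Proof of Theorem 3.1. The proof goes by approximating $(X^1_{t\Delta}X^2_{t\Delta})_{t\in\mathbb{Z}}$ by a $k$-dependent sequence (cf. [7, Definition 6.4.3]), to which we can apply a classical central limit theorem. Fix $m>0$, and set $\varphi_i^m = [(-m)\vee\varphi_i\wedge m]\mathbf{1}_{[-m,m]}$ and

$$X_t^{i,m} = \int_{\mathbb{R}}\varphi_i^m(t-s)\,dL_s = \int_{t-m}^{t+m}\varphi_i^m(t-s)\,dL_s, \quad t\in\mathbb{R},$$

for $i=1,2$. Furthermore, set

$$S_n^m = \sum_{t=1}^{n}X^{1,m}_{t\Delta}X^{2,m}_{t\Delta}, \quad n\in\mathbb{N}.$$

Note that since $\varphi_i^m\in L^2\cap L^4$ and $\varphi_i^m(t) = 0$ when $|t|>m$, $(X^{1,m}_{t\Delta}X^{2,m}_{t\Delta})_{t\in\mathbb{Z}}$ is a $k(m)$-dependent sequence of square integrable random variables, where $k(m) = \inf\{n\in\mathbb{N} : n\ge 2m/\Delta\}$. Hence, we can apply [7, Theorem 6.4.2] to deduce that

$$\frac{S_n^m - E[S_n^m]}{\sqrt{n}}\xrightarrow{\;D\;}Y_m, \quad n\to\infty,$$

where $Y_m$ is a Gaussian random variable with mean zero and variance

$$\eta_m^2 = \sum_{s=-k(m)}^{k(m)}\gamma_{X^{1,m}X^{2,m}}(s\Delta). \tag{4.3}$$

Here $\gamma_{X^{1,m}X^{2,m}}$ denotes the autocovariance function of $(X^{1,m}_tX^{2,m}_t)_{t\in\mathbb{R}}$. Next, we need to argue that $\eta_m^2\to\eta^2$ with $\eta^2$ given by (3.5). Since $\varphi_i^m\in L^2\cap L^4$ we can use Lemma 4.1 to compute $\gamma_{X^{1,m}X^{2,m}}(s\Delta)$ for each $s\in\mathbb{Z}$:

$$\begin{aligned}
\gamma_{X^{1,m}X^{2,m}}(s\Delta) ={}& \kappa_4\int_{\mathbb{R}}\varphi_1^m(t)\varphi_2^m(t)\varphi_1^m(t+s\Delta)\varphi_2^m(t+s\Delta)\,dt \\
&+ \sigma^4\int_{\mathbb{R}}\varphi_1^m(t)\varphi_1^m(t+s\Delta)\,dt\int_{\mathbb{R}}\varphi_2^m(t)\varphi_2^m(t+s\Delta)\,dt \\
&+ \sigma^4\int_{\mathbb{R}}\varphi_1^m(t)\varphi_2^m(t+s\Delta)\,dt\cdot\int_{\mathbb{R}}\varphi_2^m(t)\varphi_1^m(t+s\Delta)\,dt.
\end{aligned} \tag{4.4}$$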


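Note that $\sigma^2\int_{\mathbb{R}}\varphi_i^m(t)\varphi_j^m(t+s\Delta)\,dt\to\gamma_{X^i,X^j}(s\Delta)$, since $\varphi_i^m\to\varphi_i$ in $L^2$. (Here $\gamma_{X^i,X^j}(h) = \sigma^2\int_{\mathbb{R}}\varphi_i(t)\varphi_j(t+h)\,dt$ is the cross-covariance function of $X^i$ and $X^j$, and $\gamma_{X^i} = \gamma_{X^i,X^i}$.) By using assumption (iii) and that $F\colon t\mapsto\sum_{s\in\mathbb{Z}}|\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)|$ is a periodic function with period $\Delta$ we establish as well that

$$\sum_{s\in\mathbb{Z}}\int_{\mathbb{R}}|\kappa_4\varphi_1(t)\varphi_2(t)\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)|\,dt = \kappa_4\sum_{s\in\mathbb{Z}}\int_{s\Delta}^{(s+1)\Delta}|\varphi_1(t)\varphi_2(t)|F(t)\,dt = \kappa_4\int_0^{\Delta}F(t)^2\,dt < \infty. \tag{4.5}$$

In particular, Lebesgue's theorem on dominated convergence implies

$$\kappa_4\int_{\mathbb{R}}\varphi_1^m(t)\varphi_2^m(t)\varphi_1^m(t+s\Delta)\varphi_2^m(t+s\Delta)\,dt\to\kappa_4\int_{\mathbb{R}}\varphi_1(t)\varphi_2(t)\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)\,dt.$$

Combining these observations with (4.4) shows that $\gamma_{X^{1,m}X^{2,m}}(s\Delta)\to\gamma_s$ for each $s\in\mathbb{Z}$, where

$$\gamma_s = \kappa_4\int_{\mathbb{R}}\varphi_1(t)\varphi_2(t)\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)\,dt + \gamma_{X^1}(s\Delta)\gamma_{X^2}(s\Delta) + \gamma_{X^1,X^2}(s\Delta)\gamma_{X^2,X^1}(s\Delta).$$

It follows as well from (4.4) that

$$\begin{aligned}
|\gamma_{X^{1,m}X^{2,m}}(s\Delta)| \le{}& \kappa_4\int_{\mathbb{R}}|\varphi_1(t)\varphi_2(t)\varphi_1(t+s\Delta)\varphi_2(t+s\Delta)|\,dt \\
&+ \sigma^4\int_{\mathbb{R}}|\varphi_1(t)\varphi_1(t+s\Delta)|\,dt\int_{\mathbb{R}}|\varphi_2(t)\varphi_2(t+s\Delta)|\,dt \\
&+ \sigma^4\int_{\mathbb{R}}|\varphi_1(t)\varphi_2(t+s\Delta)|\,dt\cdot\int_{\mathbb{R}}|\varphi_2(t)\varphi_1(t+s\Delta)|\,dt.
\end{aligned} \tag{4.6}$$

Thus, if we can argue that the three terms on the right-hand side of (4.6) are summable over $s\in\mathbb{Z}$, we conclude from (4.3) that $\eta_m^2\to\sum_{s\in\mathbb{Z}}\gamma_s = \eta^2$ by dominated convergence. In (4.5) it was shown that the first term is summable. For the second term we apply Hölder's inequality (with the conjugate exponents $\alpha_1,\alpha_2$ of assumption (i)) to obtain

$$\Bigl\|\int_{\mathbb{R}}|\varphi_1(t)\varphi_1(t+\cdot\,\Delta)|\,dt\int_{\mathbb{R}}|\varphi_2(t)\varphi_2(t+\cdot\,\Delta)|\,dt\Bigr\|_{\ell^1}\le\prod_{i=1}^{2}\Bigl\|\int_{\mathbb{R}}|\varphi_i(t)\varphi_i(t+\cdot\,\Delta)|\,dt\Bigr\|_{\ell^{\alpha_i}},$$

which is finite by assumption (i). The last term is handled in the same way using the Cauchy–Schwarz inequality and assumption (ii):

$$\Bigl\|\int_{\mathbb{R}}|\varphi_1(t)\varphi_2(t+\cdot\,\Delta)|\,dt\int_{\mathbb{R}}|\varphi_2(t)\varphi_1(t+\cdot\,\Delta)|\,dt\Bigr\|_{\ell^1}\le\Bigl\|\int_{\mathbb{R}}|\varphi_1(t)\varphi_2(t+\cdot\,\Delta)|\,dt\Bigr\|_{\ell^2}^2 < \infty.$$

Consequently, $Y_m$ converges in distribution to a Gaussian random variable with mean zero and variance $\eta^2$. In light of this, the result is implied by [7, Proposition 6.3.10] if the following condition holds:

$$\forall\varepsilon>0\colon\quad\lim_{m\to\infty}\limsup_{n\to\infty}P\Bigl(\bigl|n^{-1/2}(S_n - E[S_n]) - n^{-1/2}(S_n^m - E[S_n^m])\bigr|>\varepsilon\Bigr) = 0. \tag{4.7}$$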


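In order to show (4.7) we find for fixed $m$, using [7, Theorem 7.1.1],

$$\begin{aligned}
&\limsup_{n\to\infty}E\Bigl[\bigl(n^{-1/2}(S_n - E[S_n]) - n^{-1/2}(S_n^m - E[S_n^m])\bigr)^2\Bigr] \\
&\quad= \limsup_{n\to\infty}E\Bigl[n^{-1}\Bigl(\sum_{s=1}^{n}\bigl(X^1_{s\Delta}X^2_{s\Delta} - X^{1,m}_{s\Delta}X^{2,m}_{s\Delta} - E[X^1_0X^2_0 - X^{1,m}_0X^{2,m}_0]\bigr)\Bigr)^2\Bigr] \\
&\quad= \sum_{s\in\mathbb{Z}}\gamma_{X^1X^2 - X^{1,m}X^{2,m}}(s\Delta),
\end{aligned}$$

where $\gamma_{X^1X^2 - X^{1,m}X^{2,m}}$ is the autocovariance function for $(X^1_tX^2_t - X^{1,m}_tX^{2,m}_t)_{t\in\mathbb{R}}$. First, we will establish that $X^{1,m}_0X^{2,m}_0\to X^1_0X^2_0$ in $L^2(P)$. To this end, recall that if a measurable function $f\colon\mathbb{R}^2\to\mathbb{R}$ is square integrable (with respect to the Lebesgue measure on $\mathbb{R}^2$), and $t\mapsto f(t,t)$ and $t\mapsto\kappa_4f(t,t)$ belong to $L^1$ and $L^2$, respectively, then the two-dimensional with-diagonal (Stratonovich type) integral $I(f)$ of $f$ with respect to $(L_t)_{t\in\mathbb{R}}$ is well-defined and by the Hu–Meyer formula,

$$E[I(f)^2]\le C\Bigl[\int_{\mathbb{R}^2}f(s,t)^2\,d(s,t) + \kappa_4\int_{\mathbb{R}}f(t,t)^2\,dt + \Bigl(\int_{\mathbb{R}}f(t,t)\,dt\Bigr)^2\Bigr] \tag{4.8}$$

for a suitable constant $C>0$. A fundamental property of the Stratonovich integral is that it satisfies the relation

$$I(f) = \int_{\mathbb{R}}g(t)\,dL_t\int_{\mathbb{R}}h(t)\,dL_t$$

when $f(s,t) = g\otimes h(s,t) := g(s)h(t)$ for given measurable functions $g,h\colon\mathbb{R}\to\mathbb{R}$ such that $g,h,gh\in L^2$. (See [2, 12] for details.) Since $\varphi_1\varphi_2\in L^2$ according to (4.5), we can write $I\bigl(\varphi_1(-\,\cdot)\otimes\varphi_2(-\,\cdot) - \varphi_1^m(-\,\cdot)\otimes\varphi_2^m(-\,\cdot)\bigr) = X^1_0X^2_0 - X^{1,m}_0X^{2,m}_0$, and hence (4.8) shows that

$$\begin{aligned}
E\bigl[(X^1_0X^2_0 - X^{1,m}_0X^{2,m}_0)^2\bigr]\le C\Bigl[&\int_{\mathbb{R}^2}\bigl(\varphi_1(s)\varphi_2(t) - \varphi_1^m(s)\varphi_2^m(t)\bigr)^2\,d(s,t) + \kappa_4\int_{\mathbb{R}}\bigl(\varphi_1(t)\varphi_2(t) - \varphi_1^m(t)\varphi_2^m(t)\bigr)^2\,dt \\
&+ \Bigl(\int_{\mathbb{R}}\varphi_1(t)\varphi_2(t)\,dt - \int_{\mathbb{R}}\varphi_1^m(t)\varphi_2^m(t)\,dt\Bigr)^2\Bigr]
\end{aligned} \tag{4.9}$$

for a suitable constant $C>0$. It is clear that the three terms on the right-hand side of (4.9) tend to zero as $m$ tends to infinity by dominated convergence, and thus we have that $X^{1,m}_0X^{2,m}_0\to X^1_0X^2_0$ in $L^2(P)$. In particular, this shows that $\gamma_{X^1X^2 - X^{1,m}X^{2,m}}(s\Delta)\to 0$ as $m\to\infty$ for each $s\in\mathbb{Z}$. By using the same type of bound as in (4.6), we establish the existence of a function $h\colon\mathbb{Z}\to[0,\infty)$ in $\ell^1$ with $|\gamma_{X^1X^2 - X^{1,m}X^{2,m}}(s\Delta)|\le h(s)$ for all $s\in\mathbb{Z}$ and, consequently,

$$\lim_{m\to\infty}\limsup_{n\to\infty}E\Bigl[\bigl(n^{-1/2}(S_n - E[S_n]) - n^{-1/2}(S_n^m - E[S_n^m])\bigr)^2\Bigr] = \lim_{m\to\infty}\sum_{s\in\mathbb{Z}}\gamma_{X^1X^2 - X^{1,m}X^{2,m}}(s\Delta) = 0$$

according to Lebesgue's theorem. In light of (4.7), we have finished the proof. □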

Relying on the ideas of Young’s convolution inequality, we obtain the following

lemma:


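Lemma 4.3. Let $\alpha,\beta,\gamma\in[1,\infty]$ satisfy $1/\alpha + 1/\beta - 1 = 1/\gamma$. Suppose that

$$\bigl(t\longmapsto\|f(t+\cdot\,\Delta)\|_{\ell^\alpha}\bigr)\in L^{2\alpha}([0,\Delta])\quad\text{and}\quad\bigl(t\longmapsto\|g(t+\cdot\,\Delta)\|_{\ell^\beta}\bigr)\in L^{2\beta}([0,\Delta]).$$

Then it holds that $\int_{\mathbb{R}}|f(t)g(t+\cdot\,\Delta)|\,dt\in\ell^\gamma$.

Proof. First observe that, for any measurable function $h\colon\mathbb{R}\to\mathbb{R}$ and $p\in[1,\infty]$, $h\in L^p$ if and only if $t\mapsto\|h(t+\cdot\,\Delta)\|_{\ell^p}$ belongs to $L^p([0,\Delta])$. In particular, this ensures that $f\in L^\alpha$ and $g\in L^\beta$. If $\gamma = \infty$ then $1/\alpha + 1/\beta = 1$, and the result follows immediately from Hölder's inequality. Hence, we will restrict the attention to $\gamma<\infty$, in which case we necessarily also have that $\alpha,\beta<\infty$. First, consider the case where $\alpha,\beta\neq\gamma$ or equivalently $\alpha,\beta,\gamma>1$, and set $\alpha' = \alpha/(\alpha-1)$ and $\beta' = \beta/(\beta-1)$. Note that these definitions ensure that $\alpha'(1-\beta/\gamma) = \beta$, $\beta'(1-\alpha/\gamma) = \alpha$ and $1/\alpha' + 1/\beta' + 1/\gamma = 1$. Hence, using the Hölder inequality and the facts that $f\in L^\alpha$ and $g\in L^\beta$,

$$\begin{aligned}
\int_{\mathbb{R}}|f(t)g(t+s\Delta)|\,dt &\le\Bigl(\int_{\mathbb{R}}|f(t)|^\alpha|g(t+s\Delta)|^\beta\,dt\Bigr)^{1/\gamma}\Bigl(\int_{\mathbb{R}}|f(t)|^{\beta'(1-\alpha/\gamma)}\,dt\Bigr)^{1/\beta'}\Bigl(\int_{\mathbb{R}}|g(t+s\Delta)|^{\alpha'(1-\beta/\gamma)}\,dt\Bigr)^{1/\alpha'} \\
&= M^{1/\gamma}\Bigl(\int_{\mathbb{R}}|f(t)|^\alpha|g(t+s\Delta)|^\beta\,dt\Bigr)^{1/\gamma}
\end{aligned}$$

for a suitable constant $M<\infty$. By raising both sides to the $\gamma$th power, summing over $s\in\mathbb{Z}$ and applying the Cauchy–Schwarz inequality we obtain that

$$\begin{aligned}
\Bigl\|\int_{\mathbb{R}}|f(t)g(t+\cdot\,\Delta)|\,dt\Bigr\|_{\ell^\gamma}^{\gamma} &\le M\int_{\mathbb{R}}|f(t)|^\alpha\|g(t+\cdot\,\Delta)\|_{\ell^\beta}^{\beta}\,dt \\
&\le M\Bigl(\int_0^\Delta\|f(t+\cdot\,\Delta)\|_{\ell^\alpha}^{2\alpha}\,dt\Bigr)^{1/2}\Bigl(\int_0^\Delta\|g(t+\cdot\,\Delta)\|_{\ell^\beta}^{2\beta}\,dt\Bigr)^{1/2},
\end{aligned} \tag{4.10}$$

which is finite, and thus we have finished the proof in case $\alpha,\beta\neq\gamma$. If, e.g., $\alpha = \gamma\neq\beta$, then $\beta = 1$ and $\alpha>1$. Again, set $\alpha' = \alpha/(\alpha-1)$ and note that $1/\alpha' + 1/\gamma = 1$, so the Hölder inequality ensures that

$$\int_{\mathbb{R}}|f(t)g(t+s\Delta)|\,dt\le\Bigl(\int_{\mathbb{R}}|f(t)|^\alpha|g(t+s\Delta)|^\beta\,dt\Bigr)^{1/\gamma}\Bigl(\int_{\mathbb{R}}|g(t)|\,dt\Bigr)^{1/\alpha'},$$

and hence the inequalities in (4.10) hold in this case as well for a suitable constant $M>0$. Finally, if $\alpha = \beta = \gamma = 1$, we compute that

$$\begin{aligned}
\Bigl\|\int_{\mathbb{R}}|f(t)g(t+\cdot\,\Delta)|\,dt\Bigr\|_{\ell^1} &= \int_0^\Delta\|f(t+\cdot\,\Delta)\|_{\ell^1}\|g(t+\cdot\,\Delta)\|_{\ell^1}\,dt \\
&\le\Bigl(\int_0^\Delta\|f(t+\cdot\,\Delta)\|_{\ell^1}^2\,dt\Bigr)^{1/2}\Bigl(\int_0^\Delta\|g(t+\cdot\,\Delta)\|_{\ell^1}^2\,dt\Bigr)^{1/2} < \infty,
\end{aligned}$$

which finishes the proof. □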
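The exponent relation $1/\alpha + 1/\beta - 1 = 1/\gamma$ in Lemma 4.3 is the one from Young's convolution inequality. As a rough illustration only, the sketch below checks the purely discrete analogue $\|\sum_t|f(t)g(t+\cdot)|\|_{\ell^\gamma}\le\|f\|_{\ell^\alpha}\|g\|_{\ell^\beta}$ for finitely supported sequences. This is not the mixed $\ell^p$/$L^p$ statement of the lemma, and the sequences and exponents are hypothetical choices.

```python
def lp_norm(x, p):
    """l^p norm of a finite sequence, for finite p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def abs_cross_correlation(f, g):
    """Return s -> sum_t |f(t) g(t+s)| over all lags s with overlapping support."""
    lags = range(-(len(f) - 1), len(g))
    return [
        sum(abs(f[t] * g[t + s]) for t in range(len(f)) if 0 <= t + s < len(g))
        for s in lags
    ]

# Hypothetical exponents satisfying 1/alpha + 1/beta - 1 = 1/gamma.
ALPHA = BETA = 4.0 / 3.0
GAMMA = 1.0 / (1.0 / ALPHA + 1.0 / BETA - 1.0)

f = [1.0, -0.5, 0.25, 2.0]  # hypothetical sequences
g = [0.3, 1.5, -1.0]

lhs = lp_norm(abs_cross_correlation(f, g), GAMMA)
rhs = lp_norm(f, ALPHA) * lp_norm(g, BETA)
print(GAMMA, lhs, rhs)  # lhs should not exceed rhs
```

With $\alpha = \beta = 4/3$ the resulting exponent is $\gamma = 2$, the value needed in assumption (ii) of Theorem 3.1.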


Proof of Theorem 1.2.

To show that statement (i) implies the stated weak conver-

gence of (

−E

[

])

√

, it suﬃces to check that assumptions (i)–(iii) of Theorem 3.1

are satisﬁed. Initially note that, in view of the observation in the beginning of the

proof of Lemma 4.3, the imposed assumptions imply that

∈ L

and (t 7−→kϕ

(t + · ∆)k

) ∈ L

2β

([0,∆]) for all β ∈ [α

,2].

Since

∈

−1 : α

≤ β

≤ 2

we can thus assume that

,α

∈

2] are given such that 1

/α

+ 1

/α

−

1 = 1

2. Next,

deﬁne

by the relation 1

/γ

= 2

/α

−

1 if

2 and

∞

= 2. In this case,

/γ

+ 1

/γ

= 1. By applying Lemma 4.3 with

and

, we

deduce that (i) of Theorem 3.1 holds. Assumption (ii) of Theorem 3.1 holds as well

by Lemma 4.3 with

and

= 2. Finally, we have that

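assumption (iii) of Theorem 3.1 is satisfied, since

$$\int_0^\Delta\|\varphi_1(t+\cdot\,\Delta)\varphi_2(t+\cdot\,\Delta)\|_{\ell^1}^2\,dt\le\Bigl(\int_0^\Delta\|\varphi_1(t+\cdot\,\Delta)\|_{\ell^2}^4\,dt\Bigr)^{1/2}\Bigl(\int_0^\Delta\|\varphi_2(t+\cdot\,\Delta)\|_{\ell^2}^4\,dt\Bigr)^{1/2} < \infty,$$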

where we have applied the Cauchy–Schwarz inequality both for sums and integrals.

The last part of the proof (concerning statement (ii) of the theorem) amounts to

showing that if

,ϕ

∈ L

and

,α

∈

1) are given such that

2 and

B sup

t∈R

|t|

|ϕ

(t)| < ∞, i = 1,2, (4.11)

then

t 7→ kϕ

(

· ∆

)

belongs to

([0

,∆

]) for

κ ∈ {β

}

where

∈

/α

2],

= 1

satisfy 1/β

+ 1/β

≥ 3/2. To show this, consider κ ∈ {β

,2} and write

kϕ

(t + · ∆)k

= |ϕ

(t + ∆)|

+ |ϕ

(t)|

+ |ϕ

(t −∆)|

∞

s=2

|ϕ

(t + s∆)|

∞

s=2

|ϕ

(t −s∆)|

(4.12)

for

t ∈

,∆

]. Since

∈ L

, the ﬁrst three terms on the right-hand side of

(4.12)

belong

to L

([0,∆]). The last two terms belong to L

∞

([0,∆]), since

sup

t∈[0,∆]

∞

s=2

|ϕ

(t ±s∆)|

≤ c

∆

−κα

∞

s=1

−κα

< ∞

by (4.11), and hence (t 7→ kϕ

(t + · ∆)k

) ∈ L

([0,∆]). 

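Proof of Theorem 3.5. Initially, we note that

$$\sum_{t=1}^{n}X_{t\Delta}\int_{\mathbb{R}}\sum_{s=t-n}^{t-1}b(s)\varphi((t-s)\Delta-u)\,dL_u = S_n - \varepsilon_n - \delta_n, \tag{4.13}$$

where

$$S_n = \sum_{t=1}^{n}X_{t\Delta}\int_{\mathbb{R}}b\star\varphi(t\Delta-u)\,dL_u,\qquad\varepsilon_n = \sum_{t=1}^{n}X_{t\Delta}\int_{\mathbb{R}}\sum_{s=t}^{\infty}b(s)\varphi((t-s)\Delta-u)\,dL_u$$

and

$$\delta_n = \sum_{t=1}^{n}X_{t\Delta}\int_{\mathbb{R}}\sum_{s=-\infty}^{t-n-1}b(s)\varphi((t-s)\Delta-u)\,dL_u.$$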

As pointed out in Remark 3.6, the imposed assumptions ensure that Theorem 3.1 is

applicable with

and

|b|? |ϕ|

(in particular, when

b ? ϕ

), and thus

(

−E

[

])

√

−−−→ N

,η

) where

is given by

(3.5)

. By using that

is even we

compute

s∈Z

(s∆)

(b ? ϕ)(t)(b ? ϕ)(t + s∆) dt

s∈Z

u,v∈Z

b(u)b(v)γ

((s + u)∆)γ

((s + v)∆) = k(b ? γ

)( · ∆)k

and

s∈Z

ϕ(t)(b ? ϕ)(t + s∆) dt ·

(b ? ϕ)(t)ϕ(t + s∆) dt

s∈Z

u,v∈Z

b(u)b(v)γ

((s −u)∆)γ

((s + v)∆) = k(b ? γ

)( · ∆)k

and it follows that

coincides with

(3.11)

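. In light of the decomposition (4.13) and Slutsky's theorem, we have shown the result if we can argue that $\operatorname{Var}(\varepsilon_n)/n\to 0$ and $\operatorname{Var}(\delta_n)/n\to 0$ as $n\to\infty$. We will only show that $\operatorname{Var}(\varepsilon_n)/n\to 0$, since arguments verifying that $\operatorname{Var}(\delta_n)/n\to 0$ are similar. Define $a(t) = \int_{\mathbb{R}}\varphi(s)\varphi(s+t\Delta)\,ds$ and note that we have the identities

$$E[\varepsilon_n] = \sigma^2\sum_{t=1}^{n}\sum_{s=-\infty}^{0}a(t-s)b(t-s)$$

and

$$\begin{aligned}
E[\varepsilon_n^2] &= \sum_{t,s=1}^{n}\sum_{u=t}^{\infty}\sum_{v=s}^{\infty}b(u)b(v)E[X_{t\Delta}X_{s\Delta}X_{(t-u)\Delta}X_{(s-v)\Delta}] \\
&= \sum_{t,s,u,v\in\mathbb{Z}}b(t-u)b(s-v)E[X_{t\Delta}X_{s\Delta}X_{u\Delta}X_{v\Delta}]\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}}.
\end{aligned}$$

Moreover, with

$$c(t,s,u) = \int_{\mathbb{R}}\varphi(t\Delta+v)\varphi(s\Delta+v)\varphi(u\Delta+v)\varphi(v)\,dv,$$

it follows by Lemma 4.1 that

$$E[X_{t\Delta}X_{s\Delta}X_{u\Delta}X_{v\Delta}] = \kappa_4c(t-v,s-v,u-v) + \sigma^4a(t-s)a(u-v) + \sigma^4a(t-u)a(s-v) + \sigma^4a(t-v)a(s-u)$$

for any $t,s,u,v\in\mathbb{Z}$. Thus, we establish the identity

$$\begin{aligned}
n^{-1}\operatorname{Var}(\varepsilon_n) ={}& \kappa_4n^{-1}\sum_{t,s,u,v\in\mathbb{Z}}b(t-u)b(s-v)c(t-v,s-v,u-v)\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}} \\
&+ \sigma^4n^{-1}\sum_{t,s,u,v\in\mathbb{Z}}a(t-s)a(u-v)b(t-u)b(s-v)\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}} \\
&+ \sigma^4n^{-1}\sum_{t,s,u,v\in\mathbb{Z}}a(t-v)a(s-u)b(t-u)b(s-v)\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}}.
\end{aligned} \tag{4.14}$$

It suffices to argue that each of the three terms on the right-hand side of (4.14) tends to zero as $n$ tends to infinity. Regarding the first term, by a change of variables from $(t,s,u,v)$ to $(t-v,s-u,u-v,v)$, we have

$$\begin{aligned}
&\kappa_4n^{-1}\sum_{t,s,u,v\in\mathbb{Z}}b(t-u)b(s-v)c(t-v,s-v,u-v)\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}} \\
&\quad= \kappa_4\sum_{t,s,u\in\mathbb{Z}}b(t-u)b(s+u)c(t,s+u,u)\,n^{-1}\sum_{v\in\mathbb{Z}}\mathbf{1}_{\{1\le t+v,\,s+u+v\le n\}}\mathbf{1}_{\{u+v,\,v\le 0\}}.
\end{aligned} \tag{4.15}$$

Since for fixed $t,s,u\in\mathbb{Z}$,

$$\sum_{v\in\mathbb{Z}}\mathbf{1}_{\{1\le t+v,\,s+u+v\le n\}}\mathbf{1}_{\{u+v,\,v\le 0\}}\le\min\{|t|,n\},$$

it will follow that the expression in (4.15) tends to zero as $n$ tends to infinity by Lebesgue's theorem on dominated convergence if

$$\kappa_4\sum_{t,s,u\in\mathbb{Z}}|b(t)b(s)c(t+u,s,u)| < \infty. \tag{4.16}$$

To show (4.16) we use that the function $t\mapsto\kappa_4\|\varphi(t+\cdot\,\Delta)(|b|\star|\varphi|)(t+\cdot\,\Delta)\|_{\ell^1}$ belongs to $L^2([0,\Delta])$ (by assumption (iii)) and is periodic with period $\Delta$:

$$\begin{aligned}
\kappa_4\sum_{t,s,u\in\mathbb{Z}}|b(t)b(s)c(t+u,s,u)| &\le\kappa_4\sum_{u\in\mathbb{Z}}\int_{\mathbb{R}}|\varphi(v)|(|b|\star|\varphi|)(v)|\varphi(v+u\Delta)|(|b|\star|\varphi|)(v+u\Delta)\,dv \\
&= \kappa_4\int_0^\Delta\|\varphi(v+\cdot\,\Delta)(|b|\star|\varphi|)(v+\cdot\,\Delta)\|_{\ell^1}^2\,dv < \infty.
\end{aligned}$$

Hence, (4.15) tends to zero. We will handle the second term on the right-hand side of (4.14) in a similar way. In particular, by a change of variables from $(t,s,u,v)$ to $(t,t-s,s-u,t-v)$,

$$\begin{aligned}
&n^{-1}\sum_{t,s,u,v\in\mathbb{Z}}a(t-s)a(u-v)b(t-u)b(s-v)\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}} \\
&\quad= \sum_{s,u,v\in\mathbb{Z}}a(s)a(v-u-s)b(s+u)b(v-s)\,n^{-1}\sum_{t\in\mathbb{Z}}\mathbf{1}_{\{1\le t,\,t-s\le n\}}\mathbf{1}_{\{t-s-u,\,t-v\le 0\}}.
\end{aligned} \tag{4.17}$$

For fixed $s,u,v\in\mathbb{Z}$,

$$\sum_{t\in\mathbb{Z}}\mathbf{1}_{\{1\le t,\,t-s\le n\}}\mathbf{1}_{\{t-s-u,\,t-v\le 0\}}\le\min\{|v|,n\},$$

and since

$$\begin{aligned}
\sum_{s,u,v\in\mathbb{Z}}|a(s)a(v-u-s)b(s+u)b(v-s)| &\le\|a\|_{\ell^{\alpha_1}}\Bigl\|\sum_{u,v\in\mathbb{Z}}|a(v-u-\cdot\,)b(\,\cdot+u)b(v-\cdot\,)|\Bigr\|_{\ell^{\alpha_2}} \\
&\le\Bigl\|\int_{\mathbb{R}}|\varphi(u)\varphi(u+\cdot\,\Delta)|\,du\Bigr\|_{\ell^{\alpha_1}}\Bigl\|\int_{\mathbb{R}}(|b|\star|\varphi|)(u)(|b|\star|\varphi|)(u+\cdot\,\Delta)\,du\Bigr\|_{\ell^{\alpha_2}},
\end{aligned} \tag{4.18}$$

where $\alpha_1,\alpha_2$ are the conjugate exponents of assumption (i) and the right-hand side is finite by that assumption, it follows again by dominated convergence that (4.17) tends to zero as $n$ tends to infinity. Finally, for the third term on the right-hand side of (4.14), we make a change of variables from $(t,s,u,v)$ to $(t-u,s-t,u-v,v)$ and establish the inequality

$$\Bigl|n^{-1}\sum_{t,s,u,v\in\mathbb{Z}}a(t-v)a(s-u)b(t-u)b(s-v)\mathbf{1}_{\{1\le t,s\le n\}}\mathbf{1}_{\{u,v\le 0\}}\Bigr|\le\sum_{t,s,u\in\mathbb{Z}}|a(t+u)a(t+s)b(t)b(t+s+u)|\,n^{-1}\min\{|t+u|,n\}. \tag{4.19}$$

The right-hand side of (4.19) tends to zero as $n$ tends to infinity by dominated convergence using (4.18) and that $a$ is even. Consequently, (4.14) shows that $\operatorname{Var}(\varepsilon_n)/n\to 0$ as $n\to\infty$, which ends the proof. □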

Proof of Theorem 1.1.

To show (i), deﬁne

γ ∈

2] by the relation 1

/γ

= 1

/α

/β −

and note that 1

/α

+ 1

/γ ≥

2. According to Remark 3.6 it suﬃces to check that the

assumptions of Theorem 3.1 are satisﬁed for the functions

and

|b|? |ϕ|

, which in

turn follows from the same arguments as in the proof of Theorem 1.2 if



t 7−→ kϕ(t + · ∆)k

+ kϕ(t + · ∆)k



∈ L

([0,∆]) (4.20)

and



t 7−→ k(|b|? |ϕ|)(t + · ∆)k

+ k(|b|? |ϕ|)(t + · ∆)k



∈ L

([0,∆]). (4.21)

Condition

(4.20)

holds by assumption (since

α ≤

2), so we only need to prove

(4.21)

If β = 1 so that b is summable, it follows from Jensen’s inequality that

(|b|? |ϕ|)(t)

≤ kbk

κ−1

s∈Z

|b(s)||ϕ(t + s∆)|

and thus

(

|b|? |ϕ|

)(

· ∆

)

≤ kbk

kϕ

(

· ∆

)

for any

κ ≥

1. Since

when

= 1, this shows that

(4.20)

implies

(4.21)

. Next if

β >

1, set

β/

(

β −

1). As in the

proof of Lemma 4.3 (replacing integrals by sums), we can use the Hölder inequality

to obtain the estimate

(|b|? |ϕ|)(t) ≤M

1/γ



s∈Z

|ϕ(t + s∆)|



1/β



s∈Z

|b(s)|

|ϕ(t + s∆)|



1/γ

for some constant

M >

0. By raising both sides to the

th power and exploiting the

periodicity of t 7→ kϕ(t + · ∆)k

, it follows that

k(|b|? |ϕ|)(t + · ∆)k

≤ M



s∈Z

|ϕ(t + s∆)|



γ/β

s∈Z

|b(s)|

u∈Z

|ϕ(t + (s + u)∆)|

= Mkbk

kϕ(t + · ∆)k

(4.22)


for a suﬃciently large constant

M >

0. Since

γ ≤

(4.22)

and the assumption (

t 7→

kϕ

(

· ∆

)

∈ L

4/α

([0

,∆

]) show that (

t 7→ k

(

|b|? |ϕ|

)(

· ∆

)

∈ L

([0

,∆

]). To show

t 7→ k

(

|b| ? |ϕ|

)(

· ∆

)

∈ L

([0

,∆

]), we note that the assumption 2

/α

+ 1

/β ≥

ensures that we may choose

∗

∈

[

β,

2] such that 1

/α

/β

∗

= 3

2. Using the same type

of arguments as above, now with

∗

and

∗

= 2 instead of

and

, we obtain

the inequality

k(|b|? |ϕ|)(t + · ∆)k

≤ Mkbk

∗

kϕ(t + · ∆)k

Due to the fact that (

t 7→kϕ

(

· ∆

)

∈ L

4/α

([0

,∆

]), this shows that (

t 7→ k

(

|b|? |ϕ|

)(

· ∆)k

) ∈ L

([0,∆]) and, thus, ends the proof under statement (i).

In view of the above, to show the last part of the theorem (concerning statement

(ii)), it suﬃces to argue that if ϕ ∈ L

, then

B sup

t∈R

|t|

1−α/2

|ϕ(t)| < ∞ and c

B sup

t∈Z

|t|

1−β

|b(t)|< ∞

for some

α,β >

0 with

β <

2, then there exist

p, q ∈

2] such that 2

/q ≥

b ∈ `

and (t 7→ kϕ(t + · ∆)k

) ∈ L

([0,∆]) for κ ∈ {p,2}. To do so observe that

∈

2−α

< p ≤ 2,

1−β

< q ≤ 2

and hence we may (and do) ﬁx

p, q ∈

2] such that 2

+ 1

/q ≥

(

α/

−

< −

and q(β −1) < −1. With this choice it holds that b ∈ `

, since

kbk

≤ |b(0)|

+ 2c

∞

s=1

q(β−1)

< ∞.

We can use the same type of arguments as in the last part of the proof of Theorem 1.2

to conclude that (

t 7→ kϕ

(

· ∆

)

∈ L

4/κ

([0

,∆

]) for

κ ∈ {p,

}

. Indeed, in view of

the decomposition

(4.12)

(with

playing the role of

) and the fact that

ϕ ∈ L

, it

suﬃces to argue that

sup

t∈[0,∆]

∞

s=2

|ϕ

(

t ±s∆

)

< ∞

. However, this is clearly the case

as κ(α/2−1) ≤ p(α/2 −1) < −1 and, thus,

sup

t∈[0,∆]

∞

s=2

|ϕ(t + s∆)|

≤ c

∆

κ(α/2−1)

∞

s=1

κ(α/2−1)

< ∞.

This ends the proof of the result. 

Acknowledgments

The research was supported by the Danish Council for Independent Research (grant

DFF–4002–00003).

References

[1] Avram, F. (1988). On bilinear forms in Gaussian random variables and Toeplitz matrices. Probab. Theory Related Fields 79(1), 37–45. doi: 10.1007/BF00319101.

[2] Bai, S., M.S. Ginovyan and M.S. Taqqu (2016). Limit theorems for quadratic forms of Lévy-driven continuous-time linear processes. Stochastic Process. Appl. 126(4), 1036–1065. doi: 10.1016/j.spa.2015.10.010.

[3] Basse-O'Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2017). A continuous-time framework for ARMA processes. arXiv: 1704.08574v1.

[4] Beran, J., Y. Feng, S. Ghosh and R. Kulik (2016). Long-Memory Processes. Springer.

[5] Brandes, D.P. and I.V. Curato (2018). On the sample autocovariance of a Lévy driven moving average process when sampled at a renewal sequence. arXiv: 1804.02254.

[6] Brockwell, P.J. (2001). Lévy-driven CARMA processes. Ann. Inst. Statist. Math. 53(1). Nonlinear non-Gaussian models and related filtering methods (Tokyo, 2000), 113–124. doi: 10.1023/A:1017972605872.

[7] Brockwell, P.J. and R.A. Davis (2006). Time series: theory and methods. Springer Series in Statistics. Reprint of the second (1991) edition. Springer, New York.

[8] Brockwell, P.J., R.A. Davis and Y. Yang (2011). Estimation for non-negative Lévy-driven CARMA processes. J. Bus. Econom. Statist. 29(2), 250–259. doi: 10.1198/jbes.2010.08165.

[9] Brockwell, P.J. and A. Lindner (2009). Existence and uniqueness of stationary Lévy-driven CARMA processes. Stochastic Process. Appl. 119(8), 2660–2681. doi: 10.1016/j.spa.2009.01.006.

[10] Cohen, S. and A. Lindner (2013). A central limit theorem for the sample autocorrelations of a Lévy driven continuous time moving average process. J. Statist. Plann. Inference 143(8), 1295–1306. doi: 10.1016/j.jspi.2013.03.022.

[11] Doukhan, P., G. Oppenheim and M.S. Taqqu, eds. (2003). Theory and applications of long-range dependence. Boston, MA: Birkhäuser Boston Inc.

[12] Farré, M., M. Jolis and F. Utzet (2010). Multiple Stratonovich integral and Hu-Meyer formula for Lévy processes. Ann. Probab. 38(6), 2136–2169. doi: 10.1214/10-AOP528.

[13] Fox, R. and M.S. Taqqu (1985). Noncentral limit theorems for quadratic forms in random variables having long-range dependence. Ann. Probab. 13(2), 428–446.

[14] Fox, R. and M.S. Taqqu (1987). Central limit theorems for quadratic forms in random variables having long-range dependence. Probab. Theory Related Fields 74(2), 213–240. doi: 10.1007/BF00569990.

[15] Giraitis, L. and D. Surgailis (1990). A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotical normality of Whittle's estimate. Probab. Theory Related Fields 86(1), 87–104. doi: 10.1007/BF01207515.

[16] Giraitis, L., H.L. Koul and D. Surgailis (2012). Large sample inference for long memory processes. Imperial College Press, London, xvi+577. doi: 10.1142/p591.

[17] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[18] Hamilton, J.D. (1994). Time series analysis. Princeton University Press, Princeton, NJ.

[19] Küchler, U. and M. Sørensen (2013). Statistical inference for discrete-time samples from affine stochastic delay differential equations. Bernoulli 19(2), 409–425. doi: 10.3150/11-BEJ411.

[20] Marquardt, T. (2006). Fractional Lévy processes with an application to long memory moving average processes. Bernoulli 12(6), 1099–1126.

[21] Marquardt, T. and R. Stelzer (2007). Multivariate CARMA processes. Stochastic Process. Appl. 117(1), 96–120. doi: 10.1016/j.spa.2006.05.014.

[22] Pipiras, V. and M.S. Taqqu (2017). Long-range dependence and self-similarity. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.

[23] Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[24] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Vol. 68. Cambridge Studies in Advanced Mathematics. Translated from the 1990 Japanese original, Revised by the author. Cambridge University Press.

[25] Spangenberg, F. (2015). Limit theorems for the sample autocovariance of a continuous-time moving average process with long memory. arXiv: 1502.0485

Paper G

On Non-Stationary Solutions to MSDDEs:

Representations and the Cointegration Space

Mikkel Slot Nielsen

Abstract

In this paper we study solutions to multivariate stochastic delay diﬀerential

equations (MSDDEs) and their relation to the discrete-time cointegrated VAR

model. In particular, we observe that an MSDDE can always be written in an

error correction form and, under suitable conditions, we argue that a process

with stationary increments is a solution to the MSDDE if and only if it admits

a certain Granger type representation. A direct implication of these results is a

complete characterization of the cointegration space. Finally, the relation between

MSDDEs and invertible multivariate CARMA equations is used to introduce the

cointegrated MCARMA processes.

MSC: 60G10; 60G12; 60H05; 60H10; 91G70

Keywords: Cointegration; Error correction form; Granger representation theorem; Multivariate CARMA processes; Multivariate SDDEs; Non-stationary processes

1 Introduction and main results

Cointegration refers to the phenomenon that some linear combinations of non-stationary time series are stationary. This concept goes at least back to Engle and Granger [9] who used the notion of cointegration to formalize the idea of a long run equilibrium between two or more non-stationary time series. Several models have been shown to be able to embed this idea, and one of the most popular among them is the VAR model:

$$X_t = \Gamma_1X_{t-1} + \Gamma_2X_{t-2} + \cdots + \Gamma_pX_{t-p} + \varepsilon_t, \quad t\in\mathbb{Z}. \tag{1.1}$$

Here $(\varepsilon_t)_{t\in\mathbb{Z}}$ is an $n$-dimensional, say, i.i.d. sequence with $E[\varepsilon_t] = 0$ and $E[\varepsilon_t\varepsilon_t^\top]$ invertible, and $\Gamma_1,\dots,\Gamma_p\in\mathbb{R}^{n\times n}$ are $n\times n$ matrices. If one is searching for a solution $(X_t)_{t\in\mathbb{Z}}$ which is only stationary in its differences, $\Delta X_t := X_t - X_{t-1}$, one often rephrases (1.1) in error correction form

$$\Delta X_t = \Pi_0X_{t-1} + \sum_{j=1}^{p-1}\Pi_j\Delta X_{t-j} + \varepsilon_t, \quad t\in\mathbb{Z}, \tag{1.2}$$

where $\Pi_0 = -I_n + \sum_{j=1}^{p}\Gamma_j$ and $\Pi_j = -\sum_{k=j+1}^{p}\Gamma_k$. (Here $I_n$ denotes the $n\times n$ identity matrix.) Properties of solutions to (1.1) concerning existence, uniqueness and stationarity are determined by the characteristic polynomial $\Gamma(z) := I_n - \sum_{j=1}^{p}\Gamma_jz^j$. Let $r$ be the rank of $\Pi_0 = -\Gamma(1)$ and, if $r<n$, let $\alpha^\perp,\beta^\perp\in\mathbb{R}^{n\times(n-r)}$ be matrices of rank $n-r$ satisfying $\Pi_0^\top\alpha^\perp = \Pi_0\beta^\perp = 0$. Standard existence and uniqueness results for VAR models and the Granger representation theorem yield the following:

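Theorem 1.1. Suppose that $\det\Gamma(z) = 0$ implies $|z|>1$ or $z = 1$. Moreover, suppose either that $r = n$, or $r<n$ and $(\alpha^\perp)^\top\bigl(I_n - \sum_{j=1}^{p-1}\Pi_j\bigr)\beta^\perp$ is invertible. Then a process $(X_t)_{t\in\mathbb{Z}}$ with $E[\|X_t\|^2]<\infty$ and stationary differences is a solution to (1.1) if and only if

$$X_t = \xi + C_0\sum_{j=1}^{t}\varepsilon_j + \sum_{j=-\infty}^{t}C(t-j)\varepsilon_j, \quad t\in\mathbb{Z}, \tag{1.3}$$

where:

(i) $\xi$ is a random vector satisfying $E[\|\xi\|^2]<\infty$ and $\Pi_0\xi = 0$.

(ii) $C_0 = \begin{cases}0 & \text{if } r = n,\\ \beta^\perp\bigl[(\alpha^\perp)^\top\bigl(I_n - \sum_{j=1}^{p-1}\Pi_j\bigr)\beta^\perp\bigr]^{-1}(\alpha^\perp)^\top & \text{if } r<n.\end{cases}$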

(iii) C

(

) is the

th coeﬃcient in the Taylor expansion of

z 7→ Γ

(

)

−1

−

−z

)

−1

= 1

for j ≥ 0.

(We use the conventions $\sum_{j=1}^{0} = 0$ and $\sum_{j=1}^{t} = -\sum_{j=t+1}^{0}$ when $t<0$, and $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^n$.) The representation (1.3) has several immediate consequences: (i) any solution with stationary differences can be decomposed into an initial value, a unique stationary part and a unique non-stationary part, (ii) if $r = n$ the solution is stationary and unique, and (iii) if $r<n$ the process $(\gamma^\top X_t)_{t\in\mathbb{Z}}$ is stationary if and only if $\gamma\in\mathbb{R}^n$ belongs to the row space of $\Pi_0 = -\Gamma(1)$. In particular, cointegration is present in the VAR model when $\Pi_0$ has rank $r\in(0,n)$, and the cointegration space is spanned by the rows of $\Pi_0$. There exists a massive literature on (cointegrated) VAR models, which have been applied in various fields. We refer to [9, 13, 14, 15, 20, 24] for further details.
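The passage from the VAR form (1.1) to the error correction form (1.2) is a purely algebraic rearrangement, which can be sketched as follows. The helper names and the integer coefficient matrices are hypothetical and chosen only so that the two recursions can be compared exactly; the matrices follow $\Pi_0 = -I_n + \sum_{j=1}^{p}\Gamma_j$ and $\Pi_j = -\sum_{k=j+1}^{p}\Gamma_k$.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def vec_add(*vectors):
    return [sum(c) for c in zip(*vectors)]

def error_correction_matrices(gammas):
    """Return (Pi_0, [Pi_1, ..., Pi_{p-1}]) for VAR coefficients Gamma_1..Gamma_p."""
    n, p = len(gammas[0]), len(gammas)
    zero = [[0] * n for _ in range(n)]
    total = zero
    for G in gammas:
        total = mat_add(total, G)
    pi0 = [[total[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    pis = []
    for j in range(1, p):
        tail = zero
        for G in gammas[j:]:  # Gamma_{j+1}, ..., Gamma_p
            tail = mat_add(tail, G)
        pis.append([[-v for v in row] for row in tail])
    return pi0, pis

def var_step(gammas, past, eps):
    """X_t from (1.1); past[0] = X_{t-1}, ..., past[p-1] = X_{t-p}."""
    return vec_add(eps, *(mat_vec(G, x) for G, x in zip(gammas, past)))

def ecm_step(pi0, pis, past, eps):
    """X_t = X_{t-1} + Delta X_t with Delta X_t given by (1.2)."""
    diffs = [[a - b for a, b in zip(past[j], past[j + 1])] for j in range(len(pis))]
    delta = vec_add(eps, mat_vec(pi0, past[0]),
                    *(mat_vec(P, d) for P, d in zip(pis, diffs)))
    return vec_add(past[0], delta)

# Hypothetical VAR(3) in dimension 2 with integer entries (exact arithmetic).
GAMMAS = [[[1, 2], [0, -1]], [[3, 0], [1, 1]], [[-2, 1], [4, 0]]]
PAST = [[1, -1], [2, 5], [0, 3]]
EPS = [7, -4]
PI0, PIS = error_correction_matrices(GAMMAS)
print(var_step(GAMMAS, PAST, EPS), ecm_step(PI0, PIS, PAST, EPS))
```

Both recursions return the same $X_t$ for any choice of past values, which is the content of the equivalence between (1.1) and (1.2).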

In many ways, the multivariate stochastic delay differential equation (MSDDE)

$$X_t - X_s = \int_s^t\eta\ast X(u)\,du + Z_t - Z_s, \quad s<t, \tag{1.4}$$

may be viewed as a continuous-time version of the (possibly infinite order) VAR equation (1.1). Here $Z_t = [Z_t^1,\dots,Z_t^n]^\top$, $t\in\mathbb{R}$, is a Lévy process with $Z_0 = 0$ and $E[\|Z_1\|^2]<\infty$, $\eta$ is an $n\times n$ matrix such that each entry $\eta_{ij}$ is a signed measure on $[0,\infty)$ satisfying

$$\int_{[0,\infty)}e^{\delta t}\,|\eta_{ij}|(dt) < \infty \tag{1.5}$$

for some $\delta>0$, and $\ast$ denotes convolution. (For more on the notation used in this paper, see Section 2.) Moreover, $(X_t)_{t\in\mathbb{R}}$ will be required to satisfy $E[\|X_t\|^2]<\infty$ and be given such that $(X_t)_{t\in\mathbb{R}}$ has stationary increments. The precise meaning of (1.4) is that

$$X_t^i - X_s^i = \sum_{j=1}^{n}\int_s^t\int_{[0,\infty)}X^j_{u-v}\,\eta_{ij}(dv)\,du + Z_t^i - Z_s^i, \quad i = 1,\dots,n,$$

almost surely for any $s<t$. The model (1.4) results in the multivariate Ornstein–Uhlenbeck process when choosing $\eta = A\delta_0$, $\delta_0$ being the Dirac measure at $0$ and $A\in\mathbb{R}^{n\times n}$, and stationary invertible multivariate CARMA (MCARMA) processes with a non-trivial moving average component can be represented as an MSDDE with infinite delay. Stationary solutions to equations of the type (1.4), MCARMA processes and their relations have been studied in [4, 5, 12, 17, 18]. Similarly to the VAR model, questions concerning solutions to (1.4) are tied to the function

$$h(z) = zI_n - \int_{[0,\infty)}e^{-zt}\,\eta(dt), \quad \operatorname{Re}(z) > -\delta. \tag{1.6}$$

In particular, it was shown that if $\det h(z) = 0$ implies $\operatorname{Re}(z)<0$, then the unique stationary solution $(X_t)_{t\in\mathbb{R}}$ to (1.4) with $E[\|X_t\|^2]<\infty$ takes the form

$$X_t = \int_{-\infty}^{t}C(t-u)\,dZ_u, \quad t\in\mathbb{R},$$

where $C\colon[0,\infty)\to\mathbb{R}^{n\times n}$ is characterized by its Laplace transform:

$$\int_0^\infty e^{-zt}C(t)\,dt = h(z)^{-1}, \quad \operatorname{Re}(z)\ge 0.$$

It follows that this result is an analogue to Theorem 1.1 when $r = n$. To the best of our knowledge, there is no literature on solutions to (1.4) which are non-stationary, and hence no counterpart to Theorem 1.1 exists for the case $r<n$.

The main result of this paper is a complete analogue of Theorem 1.1. In the following we will set

$$\Pi_0 = \eta([0,\infty))\quad\text{and}\quad\pi(t) = \eta([0,t]) - \eta([0,\infty)), \quad t\ge 0. \tag{1.7}$$

In Proposition 3.1 we show that (1.4) admits the following error correction form:

$$X_t - X_s = \Pi_0\int_s^tX_u\,du + \int_0^\infty\pi(u)(X_{t-u} - X_{s-u})\,du + Z_t - Z_s, \quad s<t. \tag{1.8}$$

To make (1.8) comparable to (1.2), one can formally apply the derivative operator $D$ to the equation and obtain

$$DX_t = \Pi_0X_t + \Pi\ast(DX)(t) + DZ_t, \quad t\in\mathbb{R}, \tag{1.9}$$

with $\Pi(dt) = \pi(t)\,dt$. We can now formulate the counterpart to Theorem 1.1. In the following, $r$ refers to the rank of $\Pi_0$ and in case $r<n$, then $\alpha^\perp,\beta^\perp\in\mathbb{R}^{n\times(n-r)}$ are matrices of rank $n-r$ which satisfy $\Pi_0^\top\alpha^\perp = \Pi_0\beta^\perp = 0$.


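Theorem 1.2. Suppose that $\det h(z) = 0$ implies $\operatorname{Re}(z)<0$ or $z = 0$. Moreover, suppose either that the rank $r$ of $\Pi_0$ is $n$, or strictly less than $n$ and $(\alpha^\perp)^\top(I_n - \Pi([0,\infty)))\beta^\perp$ is invertible. Then a process $(X_t)_{t\in\mathbb{R}}$ is a solution to (1.4) if and only if

$$X_t = \xi + C_0Z_t + \int_{-\infty}^{t}C(t-u)\,dZ_u, \quad t\in\mathbb{R}, \tag{1.10}$$

where the following holds:

(i) $\xi$ is a random vector satisfying $E[\|\xi\|^2]<\infty$ and $\Pi_0\xi = 0$.

(ii) $C_0 = \begin{cases}0 & \text{if } r = n,\\ \beta^\perp\bigl[(\alpha^\perp)^\top(I_n - \Pi([0,\infty)))\beta^\perp\bigr]^{-1}(\alpha^\perp)^\top & \text{if } r<n.\end{cases}$

(iii) $C\colon[0,\infty)\to\mathbb{R}^{n\times n}$ is characterized by

$$\int_0^\infty e^{-zt}C(t)\,dt = h(z)^{-1} - z^{-1}C_0, \quad \operatorname{Re}(z)\ge 0.$$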

Similarly to the VAR model, Theorem 1.2 shows that cointegration occurs in the MSDDE model when $\Pi_0$ is of reduced rank $r\in(0,n)$, and the rows of $\Pi_0$ span the cointegration space. It follows as well that we always have uniqueness up to the discrepancy term $\xi$, and the restrictions on $\xi$ depend ultimately on the rank of $\Pi_0$. Since an invertible MCARMA equation may be rephrased as an MSDDE, the notion of cointegrated invertible MCARMA processes can be studied in the MSDDE framework by relying on Theorem 1.2 (see Section 4 for details).
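As a small worked case of the long-run matrix $C_0$ in Theorem 1.2(ii), consider a two-dimensional Ornstein–Uhlenbeck type equation with $\eta = A\delta_0$, so that $\Pi_0 = A$ and $\pi\equiv 0$ (hence $\Pi([0,\infty)) = 0$). The matrix $A$ and the vectors $\alpha^\perp$, $\beta^\perp$ below are hypothetical choices with $r = 1$; the sketch only evaluates the formula and checks that $C_0$ annihilates $\Pi_0$ from both sides.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def long_run_matrix(alpha_perp, beta_perp, M):
    """C_0 = beta_perp [alpha_perp^T M beta_perp]^{-1} alpha_perp^T for n - r = 1.

    M plays the role of I_n - Pi([0, infinity)); with n - r = 1 the bracket
    is a scalar, so the inverse is a plain division.
    """
    inner = mat_mul(mat_mul(transpose(alpha_perp), M), beta_perp)[0][0]
    outer = mat_mul(beta_perp, transpose(alpha_perp))
    return [[Fraction(v) / inner for v in row] for row in outer]

# Hypothetical example: dX^1 = (X^2 - X^1) dt + dZ^1, dX^2 = dZ^2.
A = [[-1, 1], [0, 0]]          # Pi_0 = A has rank r = 1
ALPHA_PERP = [[0], [1]]        # satisfies A^T alpha_perp = 0
BETA_PERP = [[1], [1]]         # satisfies A beta_perp = 0
IDENTITY = [[1, 0], [0, 1]]    # I_n - Pi([0, inf)) since pi vanishes here

C0 = long_run_matrix(ALPHA_PERP, BETA_PERP, IDENTITY)
print(C0, mat_mul(A, C0), mat_mul(C0, A))
```

In this example both components load only on $Z^2$ in the long run, while $X^1 - X^2$, which corresponds to the row $(-1, 1)$ of $\Pi_0$, is the stationary combination.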

In Section 2 we will introduce some notation which will be used throughout the paper, and which already has been used in the introduction. The purpose of Section 3 is to develop a general theory for non-stationary solutions to MSDDEs with stationary increments, some of which will later be used to prove Theorem 1.2. In this section we will also put some emphasis on the implications of the representation (1.10), both in terms of stationary properties and concrete examples. Section 4 discusses how one can rely on the relation between invertible MCARMA equations and MSDDEs to define cointegrated MCARMA processes. In particular, under conditions similar to those imposed in [10, Theorem 4.6], we show existence and uniqueness of a cointegrated solution to the MSDDE associated to the MCARMA($p, p-1$) equation. This complements the result of [10], which ensures existence of cointegrated MCARMA($p, q$) processes when $p>q+1$. Finally, Section 5 contains the proofs of all the statements presented in the paper together with a few technical results.

2 Preliminaries

Let $f = [f_{ij}]\colon\mathbb{R}\to\mathbb{C}^{m\times k}$ be a measurable function and $\mu = [\mu_{ij}]$ a $k\times n$ matrix where each $\mu_{ij}$ is a measure on $\mathbb{R}$. Then, provided that

$$\int_{\mathbb{R}}|f_{il}(t)|\,\mu_{lj}(dt) < \infty$$

for $l = 1,\dots,k$, $i = 1,\dots,m$ and $j = 1,\dots,n$, we set

$$\int_{\mathbb{R}}f(t)\,\mu(dt) = \sum_{l=1}^{k}\begin{pmatrix}\int_{\mathbb{R}}f_{1l}(t)\,\mu_{l1}(dt) & \cdots & \int_{\mathbb{R}}f_{1l}(t)\,\mu_{ln}(dt)\\ \vdots & \ddots & \vdots\\ \int_{\mathbb{R}}f_{ml}(t)\,\mu_{l1}(dt) & \cdots & \int_{\mathbb{R}}f_{ml}(t)\,\mu_{ln}(dt)\end{pmatrix}. \tag{2.1}$$

The integral $\int_{\mathbb{R}}\mu(dt)f(t)$ is defined in a similar manner when either $f$ or $\mu$ is one-dimensional. Moreover, we will say that $\mu$ is a signed measure if it takes the form $\mu = \mu^+ - \mu^-$ for two mutually singular measures $\mu^+$ and $\mu^-$, where at least one of them is finite. The definition of the integral (2.1) extends naturally to signed matrix measures provided that the integrand is integrable with respect to the variation measure $|\mu| := \mu^+ + \mu^-$ (simply referred to as being integrable with respect to $\mu$). For a given point $t\in\mathbb{R}$, if $f(t-\cdot\,)$ is integrable with respect to $\mu$, we define the convolution

$$f\ast\mu(t) = \int_{\mathbb{R}}f(t-u)\,\mu(du).$$

For a measurable function $f\colon\mathbb{R}\to\mathbb{C}^{k\times m}$ and $\mu$ an $n\times k$ signed matrix measure, if $f^\top(t-\cdot\,)$ is integrable with respect to $\mu^\top$, we set $\mu\ast f(t) := (f^\top\ast\mu^\top)(t)^\top$. Also, if $\mu$ is a given signed matrix measure and $z\in\mathbb{C}$ is such that

$$\int_{\mathbb{R}}e^{-\operatorname{Re}(z)t}\,|\mu_{ij}|(dt) < \infty$$

for all $i$ and $j$, the $(i,j)$-th entry of the Laplace transform $\mathcal{L}[\mu](z)$ of $\mu$ at $z$ is defined by

$$\mathcal{L}[\mu]_{ij}(z) = \int_{\mathbb{R}}e^{-zt}\,\mu_{ij}(dt).$$

Eventually, if $|\mu|$ is finite, we will also use the notation $\mathcal{F}[\mu](y) = \mathcal{L}[\mu](iy)$, $y\in\mathbb{R}$, referring to the Fourier transform of $\mu$. When $\mu(dt) = f(t)\,dt$ for some measurable function $f$ we write $\mathcal{L}[f]$ and $\mathcal{F}[f]$ instead.

Finally, a stochastic process $Y_t = [Y_t^1,\dots,Y_t^n]^\top$, $t\in\mathbb{R}$, is said to be stationary, respectively have stationary increments, if the finite dimensional marginal distributions of $(Y_{t+h})_{t\in\mathbb{R}}$, respectively $(Y_{t+h} - Y_h)_{t\in\mathbb{R}}$, do not depend on $h\in\mathbb{R}$.

3 General results on existence, uniqueness and representations of solutions to MSDDEs

Suppose that $Z_t = [Z_t^1,\dots,Z_t^n]^{\top}$, $t \in \mathbb{R}$, is an $n$-dimensional measurable process with $Z_0 = 0$, stationary increments and $\mathbb{E}[\|Z_t\|^2] < \infty$, and let $\eta = [\eta_{ij}]$ be a signed $n \times n$ matrix measure which satisfies (1.5) for some $\delta > 0$. We will say that a stochastic process $X_t = [X_t^1,\dots,X_t^n]^{\top}$, $t \in \mathbb{R}$, is a solution to the corresponding multivariate stochastic delay differential equation (MSDDE) if it meets the following requirements:

(i) $(X_t)_{t\in\mathbb{R}}$ is measurable and $\mathbb{E}[\|X_t\|^2] < \infty$ for all $t \in \mathbb{R}$.

(ii) $(X_t)_{t\in\mathbb{R}}$ has stationary increments.

(iii) The relations
\[
X_t^i - X_s^i = \sum_{j=1}^{n} \int_s^t \int_{[0,\infty)} X_{u-v}^j\,\eta_{ij}(dv)\,du + Z_t^i - Z_s^i, \quad i = 1,\dots,n,
\]
hold true almost surely for each $s < t$.


Paper G · On non-stationary solutions to MSDDEs: representations and the cointegration

space

As indicated in the introduction, (iii) may be compactly written as
\[
dX_t = \eta * X(t)\,dt + dZ_t, \quad t \in \mathbb{R}. \tag{3.1}
\]
We start with the observation that (3.1) can always be written in an error correction form (as noted in (1.8)):

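Proposition 3.1. Let $\Pi_0 \in \mathbb{R}^{n\times n}$ and $\pi \colon [0,\infty) \to \mathbb{R}^{n\times n}$ be defined by (1.7), and suppose that $\delta > 0$ is given such that (1.5) is satisfied. Then $\sup_{t \ge 0} e^{\varepsilon t}\|\pi(t)\| < \infty$ for all $\varepsilon < \delta$, and (3.1) can be written as
\[
X_t - X_s = \Pi_0 \int_s^t X_u\,du + \int_0^\infty \pi(u)(X_{t-u} - X_{s-u})\,du + Z_t - Z_s, \quad s < t, \tag{3.2}
\]
so if $(X_t)_{t\in\mathbb{R}}$ is a solution to (3.1), then $(\Pi_0 X_t)_{t\in\mathbb{R}}$ is stationary.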

Remark 3.2. Using the notation $\Pi(dt) = \pi(t)\,dt$, we make the following observations in relation to Proposition 3.1:

(i) If $\Pi_0$ is invertible, a solution $(X_t)_{t\in\mathbb{R}}$ must be stationary itself.

(ii) If $\Pi_0 = 0$ the statement does not provide any further insight. Observe, however, that the equation (3.2) depends in this case only on the increments of $(X_t)_{t\in\mathbb{R}}$, so a solution need not be stationary.

(iii) If the rank $r$ of $\Pi_0$ satisfies $0 < r < n$, there exist non-trivial linear combinations of the entries of $(X_t)_{t\in\mathbb{R}}$ which are stationary.

At this point we have not argued whether or not $(X_t)_{t\in\mathbb{R}}$ can be stationary even when $r < n$ and, ultimately, it depends on the structure of the noise process $(Z_t)_{t\in\mathbb{R}}$. However, it is not too difficult to verify from Theorem 3.5 that if $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process such that $\mathbb{E}[Z_1 Z_1^{\top}]$ is invertible and $A \in \mathbb{R}^{m\times n}$, $(AX_t)_{t\in\mathbb{R}}$ is stationary if and only if $A = B\Pi_0$ for some $B \in \mathbb{R}^{m\times n}$. In case of (iii), one often considers a rank factorization of $\Pi_0$; that is, one chooses $\alpha,\beta \in \mathbb{R}^{n\times r}$ of rank $r$ such that $\Pi_0 = \alpha\beta^{\top}$. In this way one can identify the columns of $\beta$ as cointegrating vectors spanning the cointegration space, and $\alpha$ as the adjustment matrix determining how deviations from a long run equilibrium affect short run dynamics. This type of intuition is well-known for the cointegrated VAR models, so we refer to [9] for details.
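As a purely numerical illustration of the rank factorization $\Pi_0 = \alpha\beta^{\top}$, the following sketch computes one such pair from a singular value decomposition. The matrix `Pi0`, the tolerance and the helper names are hypothetical choices made for this example only; any pair $(\alpha M, \beta M^{-\top})$ with $M$ invertible would serve equally well, since the factorization is not unique.

```python
import numpy as np

def rank_factorization(Pi0, tol=1e-10):
    """Return (alpha, beta, r) with Pi0 = alpha @ beta.T and both factors of rank r."""
    U, s, Vt = np.linalg.svd(Pi0)
    r = int(np.sum(s > tol))      # numerical rank
    alpha = U[:, :r] * s[:r]      # n x r "adjustment" factor
    beta = Vt[:r, :].T            # n x r, columns span the row space of Pi0
    return alpha, beta, r

def is_of_form_B_Pi0(a, Pi0, tol=1e-10):
    """Check whether the row vector a equals b @ Pi0 for some b,
    i.e. whether a annihilates the null space of Pi0."""
    _, s, Vt = np.linalg.svd(Pi0)
    r = int(np.sum(s > tol))
    beta_perp = Vt[r:, :].T       # basis of the null space of Pi0
    return bool(np.allclose(a @ beta_perp, 0.0, atol=1e-8))

# Hypothetical 3 x 3 matrix of rank 2 (values chosen only for illustration).
Pi0 = np.array([[-1.0, 1.0, 0.0],
                [0.5, -0.5, 0.0],
                [0.0, 1.0, -1.0]])
alpha, beta, r = rank_factorization(Pi0)

cointegrating = beta[:, 0]               # a column of beta
non_cointegrating = np.array([1.0, 1.0, 1.0])  # lies in the null space of Pi0
```

Here `is_of_form_B_Pi0` mirrors the criterion $A = B\Pi_0$ quoted above for a single row: columns of $\beta$ pass the check, while a vector in the null space of $\Pi_0$ does not.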

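In the following we will search for a solution to (3.1). To this end, let $\delta > 0$ be chosen such that (1.5) holds, set $H_\delta := \{z \in \mathbb{C} : \mathrm{Re}(z) > -\delta\}$ and define $h_\eta \colon H_\delta \to \mathbb{C}^{n\times n}$ by
\[
h_\eta(z) = zI_n - \mathcal{L}[\eta](z), \quad z \in H_\delta. \tag{3.3}
\]
Since $h_\eta$ is analytic on $H_\delta$ and $|\det h_\eta(z)| \to \infty$ as $|z| \to \infty$, $z \mapsto h_\eta(z)^{-1}$ is meromorphic on $H_\delta$. Recall that if $z_0$ is a pole of $z \mapsto h_\eta(z)^{-1}$, there exists $n \in \mathbb{N}$ such that $z \mapsto (z - z_0)^n h_\eta(z)^{-1}$ is analytic and non-zero in a neighborhood of $z_0$. If $n = 1$ the pole is called simple.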

Condition 3.3. For the function $h_\eta$ in (3.3) it holds that

(i) $\det(h_\eta(z)) \neq 0$ for all $z \in H_\delta \setminus \{0\}$ and

(ii) $z \mapsto h_\eta(z)^{-1}$ has either no poles at all or a simple pole at 0.

For convenience, we have chosen to work with Condition 3.3 rather than the assumptions of Theorem 1.2. The following result shows that they are essentially the same.

Proposition 3.4. Suppose that, for some $\varepsilon > 0$,
\[
\int_{[0,\infty)} e^{\varepsilon t}\,|\eta_{ij}|(dt) < \infty, \quad i,j = 1,\dots,n.
\]
The following two statements are equivalent:

(i) There exists $\delta \in (0,\varepsilon]$ such that (1.5) and Condition 3.3 are satisfied.

(ii) The assumptions of Theorem 1.2 hold true.

We will construct a solution $(X_t)_{t\in\mathbb{R}}$ to (3.1) in a similar way as in [4], namely by applying a suitable filter (i.e., a finite signed $n\times n$ matrix measure) $\mu$ to $(Z_t)_{t\in\mathbb{R}}$. Theorem 3.5 reveals that the appropriate filter to apply is $\mu(dt) = \delta_0(dt) - f(t)\,dt$ for a suitable function $f \colon \mathbb{R} \to \mathbb{R}^{n\times n}$. This result may be viewed as a Granger type representation theorem for solutions to MSDDEs and as a general version of Theorem 1.2.

Theorem 3.5. Suppose that Condition 3.3 holds. Then there exists a unique function $f \colon [0,\infty) \to \mathbb{R}^{n\times n}$ satisfying
\[
\mathcal{L}[f](z) = I_n - z h_\eta(z)^{-1}, \quad z \in H_\delta, \tag{3.4}
\]
and the function $u \mapsto f(u) Z_{t-u}$ belongs to $L^1$ almost surely for each $t \in \mathbb{R}$. Moreover, a process $(X_t)_{t\in\mathbb{R}}$ is a solution to (3.1) if and only if
\[
X_t = \xi + C_0 Z_t + \int_0^\infty f(u)[Z_t - Z_{t-u}]\,du, \quad t \in \mathbb{R}, \tag{3.5}
\]
where $\Pi_0 \xi = 0$, $\mathbb{E}[\|\xi\|^2] < \infty$ and $C_0 = I_n - \int_0^\infty f(t)\,dt$.

Concerning the function $f$ of Theorem 3.5, it can also be obtained as a solution to a certain multivariate delay differential equation; we refer to Lemma 5.1 for more on its properties.

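Remark 3.6. Let the situation be as described in Theorem 3.5 and note that
\[
C_0 = I_n - \mathcal{L}[f](0) = z h_\eta(z)^{-1}\big|_{z=0}.
\]
Hence, if the rank $r$ of $\Pi_0$ is equal to $n$ we have that $C_0 = 0$, and if $r$ is strictly less than $n$, $C_0$ can be computed by the residue formula given in [23]. Specifically, $C_0 = \beta^{\perp}[(\alpha^{\perp})^{\top}(I_n - \Pi([0,\infty)))\beta^{\perp}]^{-1}(\alpha^{\perp})^{\top}$, where $\alpha^{\perp},\beta^{\perp} \in \mathbb{R}^{n\times(n-r)}$ are matrices of rank $n-r$ satisfying $\Pi_0^{\top}\alpha^{\perp} = \Pi_0\beta^{\perp} = 0$ (note that the inverse matrix in the expression of $C_0$ does indeed exist by Proposition 3.4).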


In the special case where $z \mapsto h_\eta(z)^{-1}$ has no poles at all, it was shown in [4, Theorem 3.1] that there exists a unique stationary solution to (3.1). The same conclusion can be reached by Theorem 3.5 using that $\Pi_0$ is invertible. Indeed, in this case any solution is stationary, $C_0 = 0$ and $\xi = 0$ (the first two implications follow from Remarks 3.2 and 3.6). While there exist several solutions when $\Pi_0$ is singular, Theorem 3.5 shows that any two solutions always have the same increments. The term $\xi$ reflects how much solutions may differ and its possible values are determined by the relation $\Pi_0\xi = 0$.

In view of Proposition 3.4 and Remark 3.6, Theorem 1.2 is an obvious consequence of Theorem 3.5 if
\[
\int_0^\infty f(u)[Z_t - Z_{t-u}]\,du = \int_{-\infty}^t C(t-u)\,dZ_u, \quad t \in \mathbb{R}. \tag{3.6}
\]
Clearly, the right-hand side of (3.6) requires that we can define integration with respect to $(Z_t)_{t\in\mathbb{R}}$. Although this is indeed possible if $(Z_t)_{t\in\mathbb{R}}$ is a Lévy process (for instance, in the sense of [19]), we will here impose the less restrictive assumption that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator as defined in [4, Proposition 4.1]:

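Corollary 3.7. Suppose that Condition 3.3 holds. Assume also that, for each $i = 1,\dots,n$, there exists a linear map $I_i \colon L^1 \cap L^2 \to L^1(\mathbb{P})$ which satisfies

(i) $I_i(\mathbb{1}_{(s,t]}) = Z_t^i - Z_s^i$ for all $s < t$, and

(ii) for all finite measures $\mu$ on $\mathbb{R}$ with $\int_{\mathbb{R}} |r|\,\mu(dr) < \infty$,
\[
I_i\Bigl(\int_{\mathbb{R}} f_r(t - \cdot\,)\,\mu(dr)\Bigr) = \int_{\mathbb{R}} I_i(f_r(t - \cdot\,))\,\mu(dr), \quad t \in \mathbb{R},
\]
where $f_r = \mathbb{1}_{[0,\infty)}(\,\cdot - r) - \mathbb{1}_{[0,\infty)}$.

Then the statement of Theorem 1.2 holds true with
\[
\Bigl(\int_{-\infty}^t C(t-u)\,dZ_u\Bigr)^{i} = \sum_{j=1}^{n} I_j(C_{ij}(t - \cdot\,)), \quad i = 1,\dots,n. \tag{3.7}
\]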

In Theorem 1.2 the function $C$ is characterized through its Laplace transform $\mathcal{L}[C]$, but one can also obtain it as a solution to a certain multivariate delay differential equation. This follows by using the similar characterization given for $f$ in Lemma 5.1; the details are discussed in Remark 5.3. It should also be stressed that the conditions for being a regular integrator (i.e., for $I_1,\dots,I_n$ to exist) are mild; many semimartingales with stationary increments (in particular, Lévy processes) and fractional Lévy processes, as studied in [16], are regular integrators. For more on regular integrators, see [4, Section 4.1].

Remark 3.8. Suppose that Condition 3.3 is satisfied, let $(Z_t)_{t\in\mathbb{R}}$ be a regular integrator, and let $(X_t)_{t\in\mathbb{R}}$ be a solution to (3.1). Since $\Pi_0\xi = 0$ and $\Pi_0 C_0 = 0$ (the latter by Remark 3.6), Corollary 3.7 implies that the stationary process $(\Pi_0 X_t)_{t\in\mathbb{R}}$ is unique and given by
\[
\Pi_0 X_t = \Pi_0 \int_{-\infty}^t C(t-u)\,dZ_u, \quad t \in \mathbb{R}. \tag{3.8}
\]
If $(Z_t)_{t\in\mathbb{R}}$ is not a regular integrator one can instead rely on Theorem 3.5 to replace $\int_{-\infty}^t C(t-u)\,dZ_u$ by $\int_0^\infty f(u)[Z_t - Z_{t-u}]\,du$ in (3.8).


We end this section by giving two examples. In both examples we suppose for convenience that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator.

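Example 3.9 (The univariate case). Consider the case where $n = 1$ and $\eta$ is a measure which admits an exponential moment in the sense of (1.5) and satisfies $h_\eta(z) \neq 0$ for all $z \in H_\delta \setminus \{0\}$. In this setup Condition 3.3 can be satisfied in two ways which ultimately determine the class of solutions characterized in Corollary 3.7:

(i) $\Pi_0 \neq 0$. In this case, the solution to (3.1) is unique and given by
\[
X_t = \int_{-\infty}^t C(t-u)\,dZ_u, \quad t \in \mathbb{R},
\]
where $\mathcal{L}[C](z) = 1/h_\eta(z)$ for $z \in \mathbb{C}$ with $\mathrm{Re}(z) \ge 0$. This is consistent with the literature on stationary solutions to univariate SDDEs (see [3, 12]).

(ii) $\Pi_0 = 0$ and $\Pi([0,\infty)) \neq 1$. In this case, a process $(X_t)_{t\in\mathbb{R}}$ is a solution to (3.1) if and only if
\[
X_t = \xi + (1 - \Pi([0,\infty)))^{-1} Z_t + \int_{-\infty}^t C(t-u)\,dZ_u, \quad t \in \mathbb{R},
\]
where $\xi$ can be any random variable with $\mathbb{E}[\xi^2] < \infty$ and $\mathcal{L}[C](z) = 1/h_\eta(z) - (1 - \Pi([0,\infty)))^{-1}/z$ for $z \in \mathbb{C}$ with $\mathrm{Re}(z) \ge 0$. Here the coefficient of $Z_t$ is the univariate form of $C_0$ in Remark 3.6.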

Suppose that we are in case (ii) and fix $h > 0$. Using the notation $\Delta_h Y_t := Y_t - Y_{t-h}$, it follows from Proposition 3.1 that $(\Delta_h X_t)_{t\in\mathbb{R}}$ is a stationary solution to the equation
\[
Y_t = \int_0^\infty Y_{t-u}\,\Pi(du) + \Delta_h Z_t, \quad t \in \mathbb{R}. \tag{3.9}
\]
Existence and uniqueness of stationary solutions to equations of the type (3.9) were studied in [3, Section 3] (when $(\Delta_h Z_t)_{t\in\mathbb{R}}$ is a suitable Lévy-driven moving average), and it was shown how these sometimes can be used to construct stationary increment solutions to univariate SDDEs.

Example 3.10 (Ornstein–Uhlenbeck). Suppose that $\eta = A\delta_0$ for some $A \in \mathbb{R}^{n\times n}$ whose spectrum $\sigma(A)$ satisfies
\[
\sigma(A) \setminus \{0\} \subseteq \{z \in \mathbb{C} : \mathrm{Re}(z) < 0\}. \tag{3.10}
\]
With this specification, the MSDDE (3.1) reads
\[
dX_t = AX_t\,dt + dZ_t, \quad t \in \mathbb{R}. \tag{3.11}
\]
Under the assumption (3.10) we have that
\[
h_\eta(z)^{-1} = \int_0^\infty e^{(A - I_n z)t}\,dt = \mathcal{L}\bigl[t \mapsto \mathbb{1}_{[0,\infty)}(t)e^{At}\bigr](z), \quad \mathrm{Re}(z) > 0.
\]
Since the set of zeroes of $\det h_\eta$ coincides with $\sigma(A)$, it follows immediately that Condition 3.3 is satisfied for some $\delta > 0$ if $0 \notin \sigma(A)$. This is the stationary case where the solution to (3.11) takes the well-known form
\[
X_t = \int_{-\infty}^t e^{A(t-u)}\,dZ_u, \quad t \in \mathbb{R}.
\]
If instead $0 \in \sigma(A)$, let $r < n$ be the rank of $A$ and choose $\alpha^{\perp},\beta^{\perp} \in \mathbb{R}^{n\times(n-r)}$ of rank $n-r$ such that $A^{\top}\alpha^{\perp} = A\beta^{\perp} = 0$. We can now rely on Proposition 3.4 and the observation that $\Pi \equiv 0$ to conclude that Condition 3.3 is satisfied if $(\alpha^{\perp})^{\top}\beta^{\perp}$ is invertible. This is the cointegrated case where the solution takes the form
\[
X_t = \xi + C_0 Z_t + \int_{-\infty}^t \bigl[e^{A(t-u)} - C_0\bigr]\,dZ_u, \quad t \in \mathbb{R},
\]
with $\mathbb{E}[\|\xi\|^2] < \infty$, $A\xi = 0$ and $C_0 = \beta^{\perp}[(\alpha^{\perp})^{\top}\beta^{\perp}]^{-1}(\alpha^{\perp})^{\top}$. In particular, the stationary process $(AX_t)_{t\in\mathbb{R}}$ takes the form
\[
AX_t = \int_{-\infty}^t A e^{A(t-u)}\,dZ_u, \quad t \in \mathbb{R}.
\]
Stationary Ornstein–Uhlenbeck processes have been widely studied in the literature (see, e.g., [1, 21, 22]). Cointegrated solutions to (3.11) have also received some attention, for instance, in [6].
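The cointegrated Ornstein–Uhlenbeck case can be checked numerically. The sketch below assumes a hypothetical $2\times 2$ matrix `A` of rank one with eigenvalues $0$ and $-3/2$ (so that (3.10) holds), computes $C_0 = \beta^{\perp}[(\alpha^{\perp})^{\top}\beta^{\perp}]^{-1}(\alpha^{\perp})^{\top}$, and compares it with $z h_\eta(z)^{-1}$ for small $z$ and with $e^{At}$ for large $t$. The helper names, the tolerance and the use of an eigendecomposition for the matrix exponential (valid here because this `A` is diagonalizable) are choices made for this illustration only.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of M, via the SVD."""
    _, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:, :].T

def long_run_matrix(A):
    """C_0 = beta_perp [(alpha_perp)^T beta_perp]^{-1} (alpha_perp)^T for eta = A * delta_0."""
    beta_perp = null_space(A)       # A @ beta_perp = 0
    alpha_perp = null_space(A.T)    # A.T @ alpha_perp = 0
    middle = alpha_perp.T @ beta_perp
    return beta_perp @ np.linalg.solve(middle, alpha_perp.T)

def expm_diagonalizable(A, t):
    """exp(A t) via an eigendecomposition; assumes A is diagonalizable."""
    w, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)).real

# Hypothetical example: A has rank 1 and eigenvalues 0 and -1.5.
A = np.array([[-1.0, 1.0],
              [0.5, -0.5]])
C0 = long_run_matrix(A)

# Residue check: z * h(z)^{-1} with h(z) = z I - A, evaluated at a small z.
z = 1e-6
residue = z * np.linalg.inv(z * np.eye(2) - A)

# Decay check: exp(A t) - C0 should vanish as t grows.
limit = expm_diagonalizable(A, 50.0)
```

In this example $A C_0 = C_0 A = 0$, which is why the term $C_0 Z_t$ drops out of $AX_t$, and the kernel $e^{At} - C_0$ decays exponentially.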

4 Cointegrated multivariate CARMA processes

In [4, Theorem 4.8] it was shown that any stationary MCARMA process satisfying a certain invertibility assumption can be characterized as the unique solution to a suitable MSDDE. This may be viewed as the continuous-time analogue of representing a discrete-time ARMA process as an infinite order AR equation. In this section we will rely on this idea and the results obtained in Section 3 to define cointegrated MCARMA processes. The focus will only be on MCARMA($p, p-1$) processes for a given $p \in \mathbb{N}$. However, the analysis should also be doable for MCARMA($p, q$) processes for a general $q \in \mathbb{N}$ with $q < p$ by extending the theory developed in the former sections to higher order MSDDEs. This was done in [4] in the stationary case. For convenience we will also assume that $(Z_t)_{t\in\mathbb{R}}$ is a regular integrator in the sense of Corollary 3.7.

We start by introducing some notation. Define $P, Q \colon \mathbb{C} \to \mathbb{C}^{n\times n}$ by
\[
P(z) = I_n z^{p} + P_1 z^{p-1} + \cdots + P_p \quad \text{and} \quad Q(z) = I_n z^{p-1} + Q_1 z^{p-2} + \cdots + Q_{p-1}
\]
for $P_1,\dots,P_p, Q_1,\dots,Q_{p-1} \in \mathbb{R}^{n\times n}$. Essentially, any definition of the MCARMA process $(X_t)_{t\in\mathbb{R}}$ aims at rigorously defining the solution to the formal differential equation
\[
P(D)X_t = Q(D)DZ_t, \quad t \in \mathbb{R}. \tag{4.1}
\]
Since $P(D)\xi = P_p\xi$ for any random vector $\xi$, one should only expect solutions to be unique up to translations belonging to the null space of $P_p$. To solve (4.1) it is only necessary to impose assumptions on $P$, but since we will be interested in an autoregressive representation of the equation, we will also impose an invertibility assumption on $Q$:

Condition 4.1 (Stationary case). If $\det P(z) = 0$ or $\det Q(z) = 0$, then $\mathrm{Re}(z) < 0$.

Under Condition 4.1 it was noted in [17, Remark 3.23] that one can find $g \colon [0,\infty) \to \mathbb{R}^{n\times n}$ which belongs to $L^1 \cap L^2$ with
\[
\mathcal{F}[g](y) = P(iy)^{-1}Q(iy), \quad y \in \mathbb{R}. \tag{4.2}
\]
Consequently, by heuristically applying the Fourier transform to (4.1) and rearranging terms, one arrives at the conclusion
\[
X_t = \int_{-\infty}^t g(t-u)\,dZ_u, \quad t \in \mathbb{R}. \tag{4.3}
\]
As should be the case, any definition used in the literature results in this process (although $(Z_t)_{t\in\mathbb{R}}$ is sometimes restricted to being a Lévy process). In Proposition 4.2 we state two characterizations without proofs; these are consequences of [17, Definition 3.20] and [4, Theorem 4.8], respectively.

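Proposition 4.2. Suppose that Condition 4.1 is satisfied and let $(X_t)_{t\in\mathbb{R}}$ be defined by (4.2)–(4.3).

(i) Choose $B_1,\dots,B_p \in \mathbb{R}^{n\times n}$ such that $z \mapsto P(z)[B_1 z^{p-1} + \cdots + B_p] - Q(z)z^{p}$ is at most of order $p-1$, and set
\[
A = \begin{bmatrix}
0 & I_n & 0 & \cdots & 0 \\
0 & 0 & I_n & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & I_n \\
-P_p & -P_{p-1} & \cdots & -P_2 & -P_1
\end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_p \end{bmatrix}.
\]
Then $X_t = C^{\top}G_t$, where $C = [I_n, 0, \dots, 0]^{\top} \in \mathbb{R}^{np\times n}$ and $(G_t)_{t\in\mathbb{R}}$ is the unique stationary process satisfying
\[
dG_t = AG_t\,dt + B\,dZ_t, \quad t \in \mathbb{R}.
\]

(ii) Set $\eta_0 = Q_1 - P_1$ and let $\eta_1 \colon [0,\infty) \to \mathbb{R}^{n\times n}$ be characterized by
\[
\mathcal{F}[\eta_1](y) = I_n iy - \eta_0 - Q(iy)^{-1}P(iy), \quad y \in \mathbb{R}. \tag{4.4}
\]
Then $(X_t)_{t\in\mathbb{R}}$ is the unique stationary process satisfying
\[
dX_t = \eta_0 X_t\,dt + \int_0^\infty \eta_1(u)X_{t-u}\,du\,dt + dZ_t, \quad t \in \mathbb{R}.
\]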

It follows from Proposition 4.2 that $(X_t)_{t\in\mathbb{R}}$ can either be defined in terms of a state-space model using the triple $(A,B,C)$ or by an MSDDE of the form (3.1) with
\[
\eta(dt) = \eta_0\,\delta_0(dt) + \eta_1(t)\,dt. \tag{4.5}
\]
While $(X_t)_{t\in\mathbb{R}}$ given by (4.3) is stationary by definition, it does indeed make sense to search for non-stationary, but cointegrated, processes satisfying (i) or (ii) of Proposition 4.2 also when Condition 4.1 does not hold. Fasen-Hartmann and Scholz [10] follow this idea by first characterizing cointegrated solutions to state-space equations and, next, defining the cointegrated MCARMA process as a cointegrated solution corresponding to the specific triple $(A,B,C)$. Their definition applies to any MCARMA($p, q$) process and they give sufficient conditions on $P$ and $Q$ for the cointegrated MCARMA process to exist when $q < p-1$. We will use the results from the former sections to define the cointegrated MCARMA($p, p-1$) process as the solution to an MSDDE.


Condition 4.3 (Cointegrated case). The following statements are true:

(i) If $\det P(z) = 0$, then either $\mathrm{Re}(z) < 0$ or $z = 0$.

(ii) The rank $r$ of $P(0) = P_p$ is reduced, $r \in (0,n)$.

(iii) The matrix $(\alpha^{\perp})^{\top}P_{p-1}\beta^{\perp}$ is invertible, where $\alpha^{\perp},\beta^{\perp} \in \mathbb{R}^{n\times(n-r)}$ are of rank $n-r$ and satisfy $P_p^{\top}\alpha^{\perp} = P_p\beta^{\perp} = 0$.

(iv) If $\det Q(z) = 0$ then $\mathrm{Re}(z) < 0$.

The assumptions (i)–(iii) of Condition 4.3 are also imposed in [10], and (iv) is imposed to ensure that (4.1) admits an MSDDE representation. In [10] they impose an additional assumption, namely that the polynomials $P$ and $Q$ are so-called left coprime, which is used to ensure that the pole of $z \mapsto P(z)^{-1}$ at 0 is also a pole of $z \mapsto P(z)^{-1}Q(z)$. However, in our case this is implied by (iv).

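Theorem 4.4. Suppose that Condition 4.3 holds. Then the measure in (4.5) is well-defined and satisfies (1.5) as well as Condition 3.3 for a suitable $\delta > 0$, and the rank of $\eta([0,\infty))$ is $r$. In particular, a process $(X_t)_{t\in\mathbb{R}}$ is a solution to the corresponding MSDDE if and only if
\[
X_t = \xi + C_0 Z_t + \int_{-\infty}^t C(t-u)\,dZ_u, \quad t \in \mathbb{R}, \tag{4.6}
\]
where $\mathbb{E}[\|\xi\|^2] < \infty$, $P_p\xi = 0$, $C_0 = \beta^{\perp}[(\alpha^{\perp})^{\top}P_{p-1}\beta^{\perp}]^{-1}(\alpha^{\perp})^{\top}Q_{p-1}$ and
\[
\mathcal{L}[C](z) = P(z)^{-1}Q(z) - z^{-1}C_0, \quad \mathrm{Re}(z) \ge 0.
\]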
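To make Condition 4.3 and the formula for $C_0$ concrete, the following sketch treats a hypothetical MCARMA(2,1) specification with $n = 2$. It checks parts (i) and (iv) of Condition 4.3 through the eigenvalues of block companion matrices (whose eigenvalues are the zeroes of $\det P$ and $\det Q$), evaluates $C_0 = \beta^{\perp}[(\alpha^{\perp})^{\top}P_{p-1}\beta^{\perp}]^{-1}(\alpha^{\perp})^{\top}Q_{p-1}$, and compares it with $zP(z)^{-1}Q(z)$ for small $z$. The coefficient matrices, tolerances and helper names are assumptions made for this illustration and do not come from the text.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of M, via the SVD."""
    _, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:, :].T

def companion(coeffs):
    """Block companion matrix of I z^p + coeffs[0] z^{p-1} + ... + coeffs[p-1]."""
    p = len(coeffs)
    n = coeffs[0].shape[0]
    A = np.zeros((n * p, n * p))
    A[:-n, n:] = np.eye(n * (p - 1))          # identity blocks above the last block row
    for k, Mk in enumerate(coeffs, start=1):  # k-th coefficient goes to block column p-k
        A[-n:, (p - k) * n:(p - k + 1) * n] = -Mk
    return A

# Hypothetical MCARMA(2,1) coefficients: P(z) = I z^2 + P1 z + P2, Q(z) = I z + Q1.
I2 = np.eye(2)
P1 = 3.0 * I2
P2 = np.array([[1.0, -1.0],
               [-0.5, 0.5]])   # rank 1, so r = 1
Q1 = 2.0 * I2

# Condition 4.3 (i) and (iv): locate the zeroes of det P and det Q.
roots_P = np.linalg.eigvals(companion([P1, P2]))
roots_Q = np.linalg.eigvals(companion([Q1]))
cond_i = all(abs(z) < 1e-8 or z.real < 0 for z in roots_P)
cond_iv = all(z.real < 0 for z in roots_Q)

# Condition 4.3 (iii) and the matrix C_0 of Theorem 4.4 (here p - 1 = 1).
beta_perp = null_space(P2)
alpha_perp = null_space(P2.T)
middle = alpha_perp.T @ P1 @ beta_perp
C0 = beta_perp @ np.linalg.solve(middle, alpha_perp.T @ Q1)

# Residue check: z P(z)^{-1} Q(z) for a small z should be close to C_0.
z = 1e-6
Pz = z**2 * I2 + z * P1 + P2
Qz = z * I2 + Q1
residue = z * np.linalg.solve(Pz, Qz)
```

For these coefficients the zeroes of $\det P$ are $0$, $-3$ and $(-3 \pm \sqrt{3})/2$, so the only zero that is not in the open left half-plane is the one at the origin, as Condition 4.3(i) requires.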

Remark 4.5. Suppose that Condition 4.3 is satisfied and define $\eta$ by (4.5). In this case, Theorem 4.4 shows that $(X_t)_{t\in\mathbb{R}}$ given by (4.6) defines a solution to the corresponding MSDDE. As noted right after the formal CARMA equation (4.1), the initial value $\xi$ should not affect whether $(X_t)_{t\in\mathbb{R}}$ can be thought of as a solution (since $P_p\xi = 0$). Hence, suppose that $\xi = 0$. By heuristically computing $\mathcal{F}[X]$ from (4.6) we obtain
\[
\mathcal{F}[X](y) = (iy)^{-1}C_0\,\mathcal{F}[DZ](y) + \mathcal{F}[C](y)\,\mathcal{F}[DZ](y) = P(iy)^{-1}Q(iy)\,\mathcal{F}[DZ](y)
\]
for $y \in \mathbb{R}$ which, by multiplication of $P(iy)$, shows that $(X_t)_{t\in\mathbb{R}}$ solves (4.1).

5 Proofs

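Proof of Proposition 3.1. We start by arguing that
\[
\sup_{t \ge 0} e^{\varepsilon t}\|\pi(t)\| < \infty \tag{5.1}
\]
for a given $\varepsilon \in (0,\delta)$. Note that, for any given finite signed matrix-valued measure $\mu$ on $[0,\infty)$,
\[
\mathcal{L}[\mu](z) = \int_{[0,\infty)} e^{-zu}\,\mu(du) = z\int_0^\infty e^{-zu}\mu([0,u])\,du \tag{5.2}
\]
for all $z \in \mathbb{C}$ with $\mathrm{Re}(z) > 0$ using integration by parts. Consequently,
\[
\mathcal{L}[\pi](z) = z^{-1}\mathcal{L}[\tilde{\eta}](z), \quad \mathrm{Re}(z) > 0, \tag{5.3}
\]
using the notation $\tilde{\eta} = \eta - \Pi_0\delta_0$. On the other hand, $z \mapsto \mathcal{L}[\tilde{\eta}](z)$ is analytic on $H_\delta$ (by (1.5)), and since $\mathcal{L}[\tilde{\eta}](0) = \mathcal{L}[\eta](0) - \eta([0,\infty)) = 0$, $z \mapsto z^{-1}\mathcal{L}[\tilde{\eta}](z)$ is also analytic on $H_\delta$, and we deduce that
\[
C := \sup_{\mathrm{Re}(z) \ge -\tilde{\varepsilon}} \|\mathcal{L}[\tilde{\eta}](z)\| + \sup_{\mathrm{Re}(z) \ge -\tilde{\varepsilon}} \|z^{-1}\mathcal{L}[\tilde{\eta}](z)\| < \infty
\]
for an arbitrary $\tilde{\varepsilon} \in (\varepsilon,\delta)$. Hence, we find that
\[
\sup_{\mathrm{Re}(z) > -\tilde{\varepsilon}} \int_{\mathbb{R}} \|z^{-1}\mathcal{L}[\tilde{\eta}](z)\|^2\,d\,\mathrm{Im}(z) \le C^2\Bigl(2 + \int_{[-1,1]^c} |x|^{-2}\,dx\Bigr) < \infty,
\]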

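and it follows by [3, Lemma 4.1] (or a slight modification of [8, Theorem 1 (Section 3.4)]) and (5.3) that $t \mapsto e^{\tilde{\varepsilon} t}\|\pi(t)\|$ belongs to $L^2$. For (5.1) to be satisfied it suffices to argue that $\sup_{t \ge 0} e^{\varepsilon t}|\pi_{ij}(t)| < \infty$, where $\pi_{ij}$ refers to an arbitrarily chosen entry of $\pi$. Using integration by parts we find that
\[
e^{\varepsilon t}|\pi_{ij}(t)| \le |\pi_{ij}(0)| + \int_0^\infty e^{\varepsilon u}\,|\eta_{ij}|(du) + \varepsilon\int_0^\infty e^{\varepsilon u}|\pi_{ij}(u)|\,du. \tag{5.4}
\]
It is clear that the first term on the right-hand side of (5.4) is finite, and the same holds for the second term by (1.5). For the last term we use the Cauchy–Schwarz inequality and the fact that $(u \mapsto e^{\tilde{\varepsilon} u}\pi_{ij}(u)) \in L^2$ to deduce
\[
\Bigl(\int_0^\infty e^{\varepsilon u}|\pi_{ij}(u)|\,du\Bigr)^2 \le \int_0^\infty e^{-2(\tilde{\varepsilon}-\varepsilon)u}\,du \int_0^\infty \bigl(e^{\tilde{\varepsilon} u}\pi_{ij}(u)\bigr)^2\,du < \infty
\]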

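and this ultimately allows us to conclude that (5.1) holds. To show (3.2) it suffices to argue that
\[
\int_s^t \tilde{\eta} * X(u)\,du = \int_0^\infty \pi(u)[X_{t-u} - X_{s-u}]\,du \tag{5.5}
\]
almost surely for each $s < t$. Using that $\tilde{\eta}$ coincides with the Lebesgue–Stieltjes measure of $\pi$, together with integration by parts on the functions $v \mapsto \pi(v)$ and $v \mapsto \int_{s-v}^{t-v} X_u\,du$, we obtain
\[
\begin{aligned}
\int_s^t \tilde{\eta} * X(u)\,du &= \lim_{N\to\infty} \int_{[0,N]} \tilde{\eta}(dv) \int_{s-v}^{t-v} X_u\,du \\
&= \lim_{N\to\infty} \Bigl(\pi(N)\int_{s-N}^{t-N} X_u\,du + \int_0^N \pi(u)[X_{t-u} - X_{s-u}]\,du\Bigr).
\end{aligned} \tag{5.6}
\]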

By [2, Corollary A.3], since $(X_t)_{t\in\mathbb{R}}$ has stationary increments and $\mathbb{E}[\|X_t\|^2] < \infty$, there exist $\alpha,\beta > 0$ such that $\mathbb{E}[\|X_u\|] \le \alpha + \beta|u|$ for all $u \in \mathbb{R}$. Consequently, we may as well find $\alpha^{*},\beta^{*} > 0$ (depending on $s$ and $t$) which satisfy
\[
\mathbb{E}\Bigl[\Bigl\|\int_{s-N}^{t-N} X_u\,du\Bigr\|\Bigr] \le \alpha^{*} + \beta^{*}N.
\]
From this inequality, and due to (5.1), each entry of $\pi(N)\int_{s-N}^{t-N} X_u\,du$ converges to 0 in $L^1(\mathbb{P})$ as $N \to \infty$. The same type of reasoning gives that
\[
\mathbb{E}\Bigl[\int_0^\infty \|\pi(u)(X_{t-u} - X_{s-u})\|\,du\Bigr] < \infty,
\]
showing that each entry of $u \mapsto \pi(u)(X_{t-u} - X_{s-u})$ is almost surely integrable with respect to the Lebesgue measure and, hence, (5.6) implies (5.5). Finally, we need to argue that if $(X_t)_{t\in\mathbb{R}}$ is a solution to (3.1), $(\Pi_0 X_t)_{t\in\mathbb{R}}$ is stationary. Since $(X_t)_{t\in\mathbb{R}}$ has stationary increments it follows immediately from (3.2) that $\lambda^{-1}\Pi_0\int_t^{t+\lambda} X_u\,du$, $t \in \mathbb{R}$, is a stationary process for any $\lambda > 0$. Since $(X_t)_{t\in\mathbb{R}}$ has stationary increments and $\mathbb{E}[\|X_t\|^2] < \infty$, it is continuous in $L^2(\mathbb{P})$ (see [2, Corollary A.3]), and hence $\lambda^{-1}\Pi_0\int_t^{t+\lambda} X_u\,du$ converges to $\Pi_0 X_t$ in $L^2(\mathbb{P})$ as $\lambda \downarrow 0$ for any $t \in \mathbb{R}$. Consequently, $(\Pi_0 X_t)_{t\in\mathbb{R}}$ is stationary as well, and this finishes the proof. □

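Proof of Proposition 3.4. Assume that we are in case (i). If $z \mapsto h_\eta(z)^{-1}$ has no poles at all, then $\det h_\eta(z) = 0$ implies $\mathrm{Re}(z) < 0$ and the rank of $\Pi_0$ is $n$, and thus case (ii) is satisfied as well. If $z \mapsto h_\eta(z)^{-1}$ has a simple pole at 0, the rank $r$ of $\Pi_0 = -h_\eta(0)$ is strictly less than $n$, and the residue formula in [23] implies that $(\alpha^{\perp})^{\top}M\beta^{\perp}$ is invertible, where
\[
M := \frac{h_\eta(z) + \eta([0,\infty))}{z}\Big|_{z=0} = I_n - \Pi([0,\infty))
\]
is the derivative of $h_\eta$ at 0, and $\alpha^{\perp},\beta^{\perp} \in \mathbb{R}^{n\times(n-r)}$ are any two matrices of rank $n-r$ satisfying $\Pi_0^{\top}\alpha^{\perp} = \Pi_0\beta^{\perp} = 0$. Conversely, if we are in case (ii), the facts that the zeroes of $z \mapsto \det h_\eta(z)$ are isolated points in $\{z \in \mathbb{C} : \mathrm{Re}(z) > -\varepsilon\}$ and $|\det h_\eta(z)| \neq 0$ for $|z|$ sufficiently large ensure the existence of a $\delta \in (0,\varepsilon]$ such that $\det h_\eta(z) \neq 0$ for all $z \in H_\delta \setminus \{0\}$. If the rank $r$ of $\Pi_0$ is $n$, $z \mapsto h_\eta(z)^{-1}$ has no poles at all on $H_\delta$, and if $r < n$ and $(\alpha^{\perp})^{\top}M\beta^{\perp}$ is invertible, the residue formula in [23] implies that $z \mapsto h_\eta(z)^{-1}$ has a simple pole at 0. □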

We will now turn to the construction of a solution to (3.1). Lemma 5.1 concerns the existence of the function $f$ introduced in Theorem 3.5 and its properties.

Lemma 5.1. Suppose that Condition 3.3 holds. Then there exists a unique function $f \colon \mathbb{R} \to \mathbb{R}^{n\times n}$ enjoying the following properties:

(i) $\sup_{t \ge 0} e^{\varepsilon t}\|f(t)\| < \infty$ for all $\varepsilon < \delta$.

(ii) $\mathcal{L}[f](z) = I_n - z h_\eta(z)^{-1}$ for all $z \in H_\delta$.

(iii) $f(t) = 0$ for $t < 0$ and $f(t) = \int_0^t f * \eta(u)\,du - \eta([0,t])$ for $t \ge 0$.

Proof. First note that, by assumption, $z \mapsto I_n - z h_\eta(z)^{-1}$ is an analytic function on $H_\delta$. For any $\varepsilon \in (0,\delta)$ we will argue that
\[
\sup_{\mathrm{Re}(z) > -\varepsilon} \int_{\mathbb{R}} \|I_n - z h_\eta(z)^{-1}\|^2\,d\,\mathrm{Im}(z) < \infty. \tag{5.7}
\]
If this is the case, a slight extension of the characterization of Hardy spaces (see [3, Lemma 4.1] or [8, Theorem 1 (Section 3.4)]) ensures the existence of a function $f \colon \mathbb{R} \to \mathbb{C}^{n\times n}$, vanishing on $(-\infty,0)$, such that each entry of $t \mapsto e^{\varepsilon t}f(t)$ belongs to $L^2$ and $\mathcal{L}[f](z) = I_n - z h_\eta(z)^{-1}$ for all $z \in \mathbb{C}$ with $\mathrm{Re}(z) > -\varepsilon$. Since $\varepsilon$ was arbitrary, it follows by uniqueness of the Laplace transform that the relation holds true for all $z \in H_\delta$. Moreover, since $\overline{\mathcal{F}[f](-y)} = \mathcal{F}[f](y)$ for all $y \in \mathbb{R}$ ($\bar{z}$ denoting the complex conjugate of $z \in \mathbb{C}$), $f$ takes values in $\mathbb{R}^{n\times n}$. To show (5.7) observe initially that
\[
C_1 := \sup_{\mathrm{Re}(z) \ge -\varepsilon} \|\mathcal{L}[\eta](z)\| < \infty,
\]
since $e^{\varepsilon t}|\eta_{ij}|(dt)$ is a finite measure for all $i,j = 1,\dots,n$. The same fact ensures that

(i) the absolute value of the determinant of $h_\eta(z)$ behaves as $|z|^{n}$ as $|z| \to \infty$, and

(ii) the dominating cofactors of $h_\eta(z)$ as $|z| \to \infty$ are those on the diagonal (the $(i,i)$-th cofactor, $i = 1,\dots,n$) and their absolute values behave as $|z|^{n-1}$ as $|z| \to \infty$.

In particular, $\|h_\eta(z)^{-1}\|$ behaves as $|z|^{-1}$ as $|z| \to \infty$ and, hence,
\[
C_2 := \sup_{\mathrm{Re}(z) \ge -\varepsilon} \|z h_\eta(z)^{-1}\| < \infty. \tag{5.8}
\]

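Consequently, for any $z \in \mathbb{C}$ with $\mathrm{Re}(z) \ge -\varepsilon$,
\[
\int_{[-1,1]} \|I_n - z h_\eta(z)^{-1}\|^2\,d\,\mathrm{Im}(z) \le 2(\sqrt{n} + C_2)^2
\]
and
\[
\int_{[-1,1]^c} \|I_n - z h_\eta(z)^{-1}\|^2\,d\,\mathrm{Im}(z) \le C_1^2 \int_{[-1,1]^c} \|h_\eta(z)^{-1}\|^2\,d\,\mathrm{Im}(z) \le (C_1C_2)^2 \int_{[-1,1]^c} |x|^{-2}\,dx
\]
using that $I_n - z h_\eta(z)^{-1} = -h_\eta(z)^{-1}\mathcal{L}[\eta](z)$ and that $\|\cdot\|$ is a submultiplicative norm. This verifies (5.7) and, hence, proves the existence of a function $f \colon \mathbb{R} \to \mathbb{R}^{n\times n}$ with $f(t) = 0$ for $t < 0$ and $\mathcal{L}[f](z) = I_n - z h_\eta(z)^{-1}$ for $z \in H_\delta$ (in particular, verifying (ii)).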

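To show (iii), note that
\[
\mathcal{L}[f * \eta](z) - \mathcal{L}[\eta](z) = -z h_\eta(z)^{-1}\mathcal{L}[\eta](z) = z\mathcal{L}[f](z), \quad z \in H_\delta. \tag{5.9}
\]
By using the observation in (5.2) on the measures $f * \eta(u)\,du$ and $\eta$ together with (5.9) we establish that
\[
f(t) = \int_0^t f * \eta(u)\,du - \eta([0,t]) \tag{5.10}
\]
for almost all $t \ge 0$. Since we can choose $f$ to satisfy (5.10) for all $t \ge 0$ without modifying its Laplace transform, we have established (iii). By the càdlàg property of $f$, the uniqueness part follows as well. Finally, we need to argue that (i) holds, and for this it suffices to argue that $\sup_{t \ge 0} e^{\varepsilon t}|f_{ij}(t)| < \infty$ for all $\varepsilon \in (0,\delta)$ where $f_{ij}$ refers to an arbitrarily chosen entry of $f$. From (5.10) it follows that the Lebesgue–Stieltjes measure of $f_{ij}$ is given by $\sum_{k=1}^{n} f_{ik} * \eta_{kj}(t)\,dt - \eta_{ij}(dt)$. Therefore, integration by parts yields
\[
\begin{aligned}
e^{\varepsilon t}|f_{ij}(t)| \le{} & |f_{ij}(0)| + \sum_{k=1}^{n} \int_0^\infty e^{\varepsilon u}\,|f_{ik}| * |\eta_{kj}|(u)\,du + \int_0^\infty e^{\varepsilon u}\,|\eta_{ij}|(du) \\
& + \varepsilon\int_0^\infty e^{\varepsilon u}|f_{ij}(u)|\,du,
\end{aligned} \tag{5.11}
\]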

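so to prove the result we only need to argue that each term on the right-hand side of (5.11) is finite. The assumption (1.5) implies immediately that $\int_0^\infty e^{\varepsilon u}\,|\eta_{ij}|(du) < \infty$. As noted in the beginning of the proof, $u \mapsto e^{\varepsilon' u}f_{ij}(u)$ belongs to $L^2$ for an arbitrary $\varepsilon' \in (0,\delta)$. In particular, for $\varepsilon' \in (\varepsilon,\delta)$,
\[
\int_0^\infty e^{\varepsilon u}|f_{ij}(u)|\,du \le \Bigl(\int_0^\infty e^{-2(\varepsilon'-\varepsilon)u}\,du \int_0^\infty \bigl(e^{\varepsilon' u}f_{ij}(u)\bigr)^2\,du\Bigr)^{1/2} < \infty.
\]
Finally, since
\[
\int_0^\infty e^{\varepsilon u}\,|f_{ik}| * |\eta_{kj}|(u)\,du = \int_{[0,\infty)} e^{\varepsilon u}\,|\eta_{kj}|(du) \int_0^\infty e^{\varepsilon u}|f_{ik}(u)|\,du,
\]
it follows by the former arguments that this term is finite as well, and this concludes the proof. □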

Remark 5.2. Suppose that $\det h_\eta(z) \neq 0$ for all $z \in H_\delta$ so that Condition 3.3 is satisfied and $z \mapsto h_\eta(z)^{-1}$ has no poles. Under this assumption it was argued in [4, Proposition 5.1] that there exists a function $g \colon \mathbb{R} \to \mathbb{R}^{n\times n}$, which is vanishing on $(-\infty,0)$, is absolutely continuous on $[0,\infty)$ and decays exponentially fast at $\infty$, such that $\mathcal{L}[g](z) = h_\eta(z)^{-1}$ for $z \in H_\delta$. Since property (ii) implies $\mathcal{L}[f](z) = -h_\eta(z)^{-1}\mathcal{L}[\eta](z)$, it must be the case that $f = -g * \eta$.

Proof of Theorem 3.5. The existence of $f$ is covered by Lemma 5.1. According to [2, Corollary A.3] and by equivalence of matrix norms, we may choose $\alpha,\beta,\gamma > 0$ such that $\mathbb{E}[\|Z_t\|] \le \alpha + \beta|t|$ for all $t \in \mathbb{R}$ and $\sum_{i,j=1}^{n} |a_{ij}| \le \gamma\|A\|$ for all $A = [a_{ij}] \in \mathbb{R}^{n\times n}$. Using this together with property Lemma 5.1(i), we obtain that
\[
\mathbb{E}\Bigl[\int_0^\infty \|f(u)Z_{t-u}\|\,du\Bigr] \le (\alpha + \beta|t|)\gamma\int_0^\infty \|f(u)\|\,du + \beta\gamma\int_0^\infty \|f(u)\|\,|u|\,du < \infty.
\]
In particular, this shows that $u \mapsto f(u)Z_{t-u}$ belongs to $L^1$ almost surely and, hence, $(X_t)_{t\in\mathbb{R}}$ given by (3.5) is a well-defined process. We will now split the proof in two parts: first, we argue that $(X_t)_{t\in\mathbb{R}}$ given by (3.5) is indeed a solution to (3.1) (existence) and, next, we show that any other solution necessarily admits this representation (uniqueness).

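Existence: Note that $\mathbb{E}[\|Z_t\|^2] \le \gamma_1 + \gamma_2 t^2$ for all $t$ and suitable $\gamma_1,\gamma_2 > 0$ by [2, Corollary A.3], so we may use similar reasoning as above to deduce that $\mathbb{E}[\|X_t\|^2] < \infty$ for all $t \in \mathbb{R}$. Moreover, since $(X_t)_{t\in\mathbb{R}}$ solves (3.1) if and only if it solves (3.2), we may and do assume $\xi = 0$ so that
\[
X_t = Z_t - \int_0^\infty f(u)Z_{t-u}\,du, \quad t \in \mathbb{R}.
\]
To show that $(X_t)_{t\in\mathbb{R}}$ satisfies (3.1), we need to argue that
\[
X_t - X_s - (Z_t - Z_s) = \int_s^t \eta * X(u)\,du, \quad s < t. \tag{5.12}
\]
To this end, note that
\[
X_t - X_s - (Z_t - Z_s) = \int_{\mathbb{R}} \eta((s-u,t-u])Z_u\,du - \int_{\mathbb{R}} \int_{s-u}^{t-u} f * \eta(v)\,dv\,Z_u\,du \tag{5.13}
\]
and
\[
\begin{aligned}
\int_s^t \eta * X(u)\,du &= \int_{\mathbb{R}} \eta((s-u,t-u])X_u\,du \\
&= \int_{\mathbb{R}} \eta((s-u,t-u])Z_u\,du - \int_{\mathbb{R}} \eta((s-u,t-u]) \int_0^\infty f(v)Z_{u-v}\,dv\,du
\end{aligned} \tag{5.14}
\]
using Lemma 5.1(iii) and (3.5), respectively. Moreover, by comparing their Laplace transforms, one can verify that $\eta * f := (f^{\top} * \eta^{\top})^{\top} = f * \eta$ and, thus,
\[
\begin{aligned}
\int_{\mathbb{R}} \int_{s-u}^{t-u} f * \eta(v)\,dv\,Z_u\,du &= \int_{\mathbb{R}} \int_{\mathbb{R}} \eta((s-u-v,t-u-v])f(v)\,dv\,Z_u\,du \\
&= \int_{\mathbb{R}} \eta((s-u,t-u]) \int_0^\infty f(v)Z_{u-v}\,dv\,du.
\end{aligned} \tag{5.15}
\]
It follows by combining (5.13)–(5.15) that (5.12) is satisfied. Recall that, for $(X_t)_{t\in\mathbb{R}}$ to be a solution, we need to argue that $(X_t)_{t\in\mathbb{R}}$ has stationary increments. However, since
\[
X_{t+h} - X_h = (Z_{t+h} - Z_h) - \int_0^\infty f(u)\bigl[(Z_{t-u+h} - Z_h) - (Z_{-u+h} - Z_h)\bigr]\,du, \quad t \in \mathbb{R},
\]
and the distribution of $(Z_{t+h} - Z_h)_{t\in\mathbb{R}}$ does not depend on $h$, it follows that the distribution of $(X_{t+h} - X_h, Z_{t+h} - Z_h)_{t\in\mathbb{R}}$ does not depend on $h$. A rigorous argument can be carried out by approximating the above Lebesgue integral by Riemann sums in $L^1(\mathbb{P})$; since this procedure is similar to the one used in the proof of [4, Theorem 3.1], we omit the details here.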

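Uniqueness: Suppose that $(Y_t)_{t\in\mathbb{R}}$ satisfies (3.1), $\mathbb{E}[\|Y_t\|^2] < \infty$ for all $t \in \mathbb{R}$, and $(Y_t)_{t\in\mathbb{R}}$ has stationary increments. In addition, suppose for the moment that we have already shown that
\[
Y_t - Y_s = X_t - X_s, \quad s,t \in \mathbb{R}. \tag{5.16}
\]
Then it follows from (3.2) that $\lambda^{-1}\Pi_0\int_t^{t+\lambda}(Y_u - X_u)\,du = 0$ almost surely for all $\lambda > 0$. On the other hand, since $(X_t)_{t\in\mathbb{R}}$ and $(Y_t)_{t\in\mathbb{R}}$ have stationary increments, they are continuous in $L^2(\mathbb{P})$ and, hence, $\lambda^{-1}\Pi_0\int_t^{t+\lambda}(Y_u - X_u)\,du \to \Pi_0(Y_t - X_t)$ in $L^2(\mathbb{P})$ as $\lambda \downarrow 0$. This shows that $Y_t - X_t$ belongs to the null space of $\Pi_0$ almost surely and, consequently, $(Y_t)_{t\in\mathbb{R}}$ is necessarily of the form (3.5). The remaining part of the proof concerns showing (5.16) or, equivalently, that the process $\Delta_h Y_t := Y_t - Y_{t-h}$, $t \in \mathbb{R}$, is unique for any $h > 0$. We will rely on the same type of ideas as in the proof of [6, Proposition 7] and [10, Proposition 4.5]. Suppose first that $\Pi_0$ has reduced rank $r \in (0,n)$ and let $\alpha,\beta \in \mathbb{R}^{n\times r}$ be a rank decomposition of $\Pi_0$ as in Remark 3.2. Moreover, let $\alpha^{\perp},\beta^{\perp} \in \mathbb{R}^{n\times(n-r)}$ be matrices of rank $n-r$ such that $\Pi_0^{\top}\alpha^{\perp} = \Pi_0\beta^{\perp} = 0$. Then it follows from Proposition 3.1 that
\[
\begin{aligned}
\alpha^{\top}\Delta_h Y_t &= \alpha^{\top}\alpha\beta^{\top}\int_0^h Y_{t-u}\,du + \alpha^{\top}\int_0^\infty \pi(u)\Delta_h Y_{t-u}\,du + \alpha^{\top}\Delta_h Z_t \\
\text{and} \quad (\alpha^{\perp})^{\top}\Delta_h Y_t &= (\alpha^{\perp})^{\top}\int_0^\infty \pi(u)\Delta_h Y_{t-u}\,du + (\alpha^{\perp})^{\top}\Delta_h Z_t
\end{aligned} \tag{5.17}
\]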


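for each $t \in \mathbb{R}$. Define the stationary processes
\[
U_t = (\beta^{\top}\beta)^{-1}\beta^{\top}Y_t \quad \text{and} \quad V_t = ((\beta^{\perp})^{\top}\beta^{\perp})^{-1}(\beta^{\perp})^{\top}\Delta_h Y_t, \quad t \in \mathbb{R}.
\]
By using that $\Delta_h Y_t = \beta\Delta_h U_t + \beta^{\perp}V_t$ and rearranging terms, (5.17) can be written as
\[
\mu * \begin{bmatrix} U \\ V \end{bmatrix}(t) = W_t, \quad t \in \mathbb{R}, \tag{5.18}
\]
where
\[
\mu = \begin{bmatrix}
\alpha^{\top}\bigl[(\delta_0 - \delta_h)I_n - (\alpha\beta^{\top}\mathbb{1}_{(0,h]} + \Delta_h\pi)\cdot\lambda\bigr]\beta & \alpha^{\top}[\delta_0 I_n - \pi\cdot\lambda]\beta^{\perp} \\
(\alpha^{\perp})^{\top}\bigl[(\delta_0 - \delta_h)I_n - (\Delta_h\pi)\cdot\lambda\bigr]\beta & (\alpha^{\perp})^{\top}[\delta_0 I_n - \pi\cdot\lambda]\beta^{\perp}
\end{bmatrix}
\]
and $W_t = [\Delta_h Z_t^{\top}\alpha, \Delta_h Z_t^{\top}\alpha^{\perp}]^{\top}$. (For brevity, we have used the notation $f\cdot\lambda(du) = f(u)\,du$.) Now, note that the Fourier transform $\mathcal{F}[\mu]$ of $\mu$ takes the form
\[
\mathcal{F}[\mu](y) = \begin{bmatrix}
\alpha^{\top}\bigl((1 - e^{-ihy})[I_n - \mathcal{F}[\pi](y)] - \alpha\beta^{\top}\mathcal{F}[\mathbb{1}_{(0,h]}](y)\bigr)\beta & \alpha^{\top}[I_n - \mathcal{F}[\pi](y)]\beta^{\perp} \\
(\alpha^{\perp})^{\top}(1 - e^{-ihy})[I_n - \mathcal{F}[\pi](y)]\beta & (\alpha^{\perp})^{\top}[I_n - \mathcal{F}[\pi](y)]\beta^{\perp}
\end{bmatrix}.
\]
In particular, it follows that
\[
\begin{aligned}
\det\mathcal{F}[\mu](0) &= \det\begin{bmatrix}
-\alpha^{\top}\alpha\beta^{\top}\beta h & \alpha^{\top}[I_n - \mathcal{F}[\pi](0)]\beta^{\perp} \\
0 & (\alpha^{\perp})^{\top}[I_n - \mathcal{F}[\pi](0)]\beta^{\perp}
\end{bmatrix} \\
&= (-h)^{r}\det(\alpha^{\top}\alpha)\det(\beta^{\top}\beta)\det\bigl((\alpha^{\perp})^{\top}[I_n - \Pi([0,\infty))]\beta^{\perp}\bigr),
\end{aligned}
\]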



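which is non-zero by Proposition 3.4. Consequently, it follows from (5.18) that the means of $(U_t)_{t\in\mathbb{R}}$ and $(V_t)_{t\in\mathbb{R}}$ are uniquely determined by the one of $(W_t)_{t\in\mathbb{R}}$; namely $[\mathbb{E}[U_0]^{\top}, \mathbb{E}[V_0]^{\top}]^{\top} = \mu([0,\infty))^{-1}\mathbb{E}[W_0]$. For this reason we may without loss of generality assume that $(U_t)_{t\in\mathbb{R}}$, $(V_t)_{t\in\mathbb{R}}$ and $(W_t)_{t\in\mathbb{R}}$ are all zero mean processes so that they admit spectral representations. Recall that the spectral representation of a stationary, square integrable and zero mean process $(S_t)_{t\in\mathbb{R}}$ is given by $S_t = \int_{\mathbb{R}} e^{ity}\,\Lambda_S(dy)$, $t \in \mathbb{R}$, where $(\Lambda_S(t))_{t\in\mathbb{R}}$ is a complex-valued spectral process which is square integrable and continuous in $L^2(\mathbb{P})$, and which has orthogonal increments. (Integration with respect to $\Lambda_S$ can be defined as in [11, pp. 388–390] for all functions in $L^2(F_S)$, $F_S$ being the spectral distribution of $(S_t)_{t\in\mathbb{R}}$.) Consequently, by letting $\Lambda_U$, $\Lambda_V$ and $\Lambda_W$ be the spectral processes corresponding to $(U_t)_{t\in\mathbb{R}}$, $(V_t)_{t\in\mathbb{R}}$ and $(W_t)_{t\in\mathbb{R}}$, equation (5.18) can be rephrased as
\[
\int_{\mathbb{R}} e^{ity}\mathcal{F}[\mu](y)\begin{bmatrix} \Lambda_U \\ \Lambda_V \end{bmatrix}(dy) = \int_{\mathbb{R}} e^{ity}\,\Lambda_W(dy), \quad t \in \mathbb{R}. \tag{5.19}
\]
Here we have used a stochastic Fubini result for spectral processes, e.g., [7, Proposition A.1]. Since the functions $y \mapsto e^{ity}$, $t \in \mathbb{R}$, are dense in $L^2(F)$ for any finite measure $F$ (cf. [25, p. 150]), the relation (5.19) remains true when $y \mapsto e^{ity}$ is replaced by any measurable and, say, bounded function $g \colon \mathbb{R} \to \mathbb{C}^{n\times n}$. In particular, we will choose
\[
g(y) = e^{ity}(iy)h_\eta(iy)^{-1}\bigl([\alpha, \alpha^{\perp}]^{\top}\bigr)^{-1}, \quad y \neq 0,
\]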

176

5 · Proofs

and

(0) = [0

n×r

, β

⊥

]

[

](0)

−1

. Note that by

(5.8)

is indeed bounded. After observing that

F [µ](y) =

(α

⊥

)







(1 −e

−ihy

)

−F [π](y) + (iy)

−1

αβ

−F [π](y)

(1 −e

−ihy

)[I

−F [π](y)] I

−F [π](y)







β β

⊥

= (iy)

−1

(α

⊥

)

(iy)

(1 −e

−ihy

)β β

⊥

for $y \neq 0$, it is easy to verify that

(

)

[

](

) = [

(

ity

−e

i(t−h)y

)

, β

⊥

ity

] for all

y ∈ R

Consequently, it follows from (5.19) that

∆

β(e

ity

−e

i(t−h)y

) β

⊥

ity

(dy) =

g(y)Λ

(dy),

showing that the process (

∆

)

t∈R

is uniquely determined by (

)

t∈R

. Now we only

need to argue that this type of uniqueness also holds when

is invertible and

= 0.

is invertible, (

)

t∈R

must in fact be stationary (cf. Remark 3.2), and by [4,

Theorem 3.1] there is only one process enjoying this property. If

= 0, the case is

simpler than if

r ∈

), since here we only need to consider the second equation of

(5.17)

with

⊥

and the spectral representation of (

∆

)

t∈R

. To avoid too many

repetitions we leave out the details. 

Proof of Corollary 3.7.

As noted right before the statement we only need to argue

that (3.6) is satisﬁed with respect to the deﬁnition (3.7). In order to do so, note that



∞

f (u)(Z

−Z

t−u

) du



j=1

∞

(t−u,t]

(u) du

j=1



[0,∞)

(t − · )

∞

t− ·

(u) du





−∞

C(t −u) dZ



where

(

) = 0 for

t <

0 and

(

) =

∞

(

)

for

t ≥

0. Now observe that, for

z ∈ C

with Re(z) < 0,

L[C](z) = z

−1



∞

f (t) dt −L[f ](z)



= h

(z)

−1

−z

−1

(5.20)

using Remark 3.6 and Lemma 5.1(ii). Since both sides of

(5.20)

are analytic functions

, the equality holds true on

. This proves that

can be characterized as in

the statement of Theorem 1.2 and, thus, ﬁnishes the proof. 

Remark 5.3.

As was the case for the function

of Lemma 5.1,

can also be obtained

as a solution to a multivariate delay diﬀerential equation. Speciﬁcally, the shifted

function

C(t) = C

+ C(t), t ≥ 0, satisﬁes

C(t) −

C(s) =

C ∗η(u) du, 0 ≤ s < t. (5.21)


Paper G · On non-stationary solutions to MSDDEs: representations and the cointegration

space

By Theorem 3.5 the initial condition is

(0) =

. To see that

(5.21)

holds note that,

for ﬁxed 0 ≤ s < t, Lemma 5.1(iii) implies

C(t) −

C(s) = −

f (u) du =



η([0,u])−

f ∗η(v) dv



du,

and

f ∗η(v) dv =

[0,∞)

u−r

f (v) dv η(dr) = η([0,u]) −

C ∗η(u)

by Fubini’s theorem. In the same way as in the proof of Theorem 3.1, one can rely on

integration by parts to write (5.21) in error correction form:

C(t) −

C(s) =

C(u)Π

du +

∞

[

C(t −u)−

C(s −u)]Π(du), 0 ≤s < t.

Proof of Theorem 1.2.

In view of Proposition 3.4 we may assume that Condition 3.3

is satisfied. Consequently, by using [4, Example 4.2], which states that an $n$-dimensional Lévy process with finite first moments is a regular integrator (that is, there

exist

,... , I

satisfying Corollary 3.7(i)–(ii)), the result is an immediate consequence

of Corollary 3.7. 

Proof of Theorem 4.4.

Note that, by Condition 4.3(iv), we can choose $\varepsilon > 0$ such that $\det Q(z) \neq 0$ whenever $\operatorname{Re}(z) \geq -\varepsilon$. To show that (4.5) is well-defined and satisfies (1.5) for some $\delta > 0$ it suffices to establish

sup

Re(z)>−ε

z −η

−Q(z)

−1

P (z)k

dIm(z) < ∞. (5.22)

(See, e.g., the beginning of the proof of Lemma 5.1.) It is straightforward to verify that

−P

is chosen such that

z 7→ Q

(

)(

z −η

)

−P

(

) is a polynomial of at most

order

p −

2. Consequently, the integrand in

(5.22)

is of the form

(

)

−1

(

)

, where

is of strictly larger degree than

, and hence it follows by sub-multiplicativity of $\|\cdot\|$ that it decays at least as fast as $|z|^{-2}$ when $|z| \to \infty$. Since the integrand is also

bounded on compact subsets of

{z ∈ C

(

)

≤ ε}

we conclude that

(5.22)

is satisﬁed.

Next, we will show that the assumptions of Theorem 1.2 are satisﬁed (which, by

Proposition 3.4, is equivalent to showing that Condition 3.3 holds). Observe that

(

) =

(

)

−1

(

) when

(

)

> −ε

, so by (i) and (iv) in Condition 4.3 it follows that

deth

(

) = 0 implies

(

)

0 or

= 0. Now, a Taylor expansion of

z 7→ Q

(

)

−1

around

0 yields

L[η](z) = η([0,∞)) +



+ Q

−1

p−1

p−2

−1

p−1

−Q

−1

p−1



z + O(z

), |z| → 0,

and hence

Π([0,∞)) =

L[η](z) −η([0,∞))



z=0

= I

−Q

−1

p−1

−Q

p−2

−1

p−1

Let

p−1

⊥

and

⊥

, and note that these matrices are of rank

n −r

and satisfy

α = Π

β = 0. Thanks to Condition 4.3(iii), the matrix

−Π([0,∞)))

β = (α

⊥

)

p−1

⊥

is invertible, so the assumptions of Theorem 1.2 are satisfied. The remaining statements are now simply consequences of Corollary 3.7.




Acknowledgments

I would like to thank Andreas Basse-O’Connor and Jan Pedersen for helpful com-

ments. This work was supported by the Danish Council for Independent Research

(grant DFF–4002–00003).

References

[1] Barndorff-Nielsen, O.E., J.L. Jensen and M. Sørensen (1998). Some stationary processes in discrete and continuous time. Adv. in Appl. Probab. 30(4), 989–1007. doi: 10.1239/aap/1035228204.

[2] Barndorff-Nielsen, O.E. and A. Basse-O’Connor (2011). Quasi Ornstein–Uhlenbeck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[3] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2019). Stochastic delay differential equations and related autoregressive models. Stochastics. Forthcoming. doi: 10.1080/17442508.2019.1635601.

[4] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2018). Multivariate stochastic delay differential equations and CAR representations of CARMA processes. Stochastic Process. Appl. Forthcoming. doi: 10.1016/j.spa.2018.11.011.

[5] Brockwell, P.J. (2014). Recent results in the theory and applications of CARMA processes. Ann. Inst. Statist. Math. 66(4), 647–685. doi: 10.1007/s10463-014-0468-7.

[6] Comte, F. (1999). Discrete and continuous time cointegration. J. Econometrics 88(2), 207–226. doi: 10.1016/S0304-4076(98)00025-6.

[7] Davis, R.A., M.S. Nielsen and V. Rohde (2019). Stochastic differential equations with a fractionally filtered delay: a semimartingale model for long-range dependent processes. Bernoulli. Forthcoming.

[8] Dym, H. and H.P. McKean (2016). Séries et intégrales de Fourier. Vol. 13. Nouvelle Bibliothèque Mathématique [New Mathematics Library]. Translated from the 1972 English original by Éric Kouris. Cassini, Paris.

[9] Engle, R.F. and C.W.J. Granger (1987). Co-integration and error correction: representation, estimation, and testing. Econometrica 55(2), 251–276. doi: 10.2307/1913236.

[10] Fasen-Hartmann, V. and M. Scholz (2016). Cointegrated Continuous-time Linear State Space and MCARMA Models. arXiv: 1611.07876.

[11] Grimmett, G. and D. Stirzaker (2001). Probability and random processes. Oxford University Press.

[12] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[13] Hansen, P.R. (2005). Granger’s representation theorem: a closed-form expression for I(1) processes. Econom. J. 8(1), 23–38. doi: 10.1111/j.1368-423X.2005.00149.x.

[14] Johansen, S. (1991). Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59(6), 1551–1580. doi: 10.2307/2938278.

[15] Johansen, S. (2009). “Cointegration: Overview and development”. Handbook of financial time series. Springer, 671–693.

[16] Marquardt, T. (2006). Fractional Lévy processes with an application to long memory moving average processes. Bernoulli 12(6), 1099–1126.

[17] Marquardt, T. and R. Stelzer (2007). Multivariate CARMA processes. Stochastic Process. Appl. 117(1), 96–120. doi: 10.1016/j.spa.2006.05.014.

[18] Mohammed, S.E.A. and M.K.R. Scheutzow (1990). Lyapunov exponents and stationary solutions for affine stochastic delay equations. Stochastics Stochastics Rep. 29(2), 259–283.

[19] Rajput, B.S. and J. Rosiński (1989). Spectral representations of infinitely divisible processes. Probab. Theory Related Fields 82(3), 451–487.

[20] Runkle, D.E. (2002). Vector autoregressions and reality. J. Bus. Econom. Statist. 20(1), 128–133. doi: 10.1198/073500102753410435.

[21] Sato, K., T. Watanabe and M. Yamazato (1994). Recurrence conditions for multidimensional processes of Ornstein–Uhlenbeck type. J. Math. Soc. Japan 46(2), 245–265. doi: 10.2969/jmsj/04620245.

[22] Sato, K. and M. Yamazato (1983). “Stationary processes of Ornstein–Uhlenbeck type”. Probability theory and mathematical statistics (Tbilisi, 1982). Vol. 1021. Lecture Notes in Math. Springer, Berlin, 541–551. doi: 10.1007/BFb0072949.

[23] Schumacher, J.M. (1991). “System-theoretic trends in econometrics”. Mathematical system theory. Springer, Berlin, 559–577.

[24] Sims, C.A. (1980). Macroeconomics and reality. Econometrica, 1–48.

[25] Yaglom, A.M. (1987). Correlation theory of stationary and related random functions. Vol. I. Springer Series in Statistics. Basic results. New York: Springer-Verlag.


Paper H

Low Frequency Estimation of Lévy-Driven

Moving Averages

Mikkel Slot Nielsen

Abstract

In this paper we consider least squares estimation of the driving kernel of a

moving average and argue that, under mild regularity conditions and a decay

condition on the kernel, the suggested estimator is consistent and asymptotically

normal. On one hand this result uniﬁes scattered results of the literature on low

frequency estimation of moving averages, and on the other hand it emphasizes

the validity of inference also in cases where the moving average is not strongly

mixing. We assess the performance of the estimator through a simulation study.

Keywords: Least squares estimation; Lévy-driven moving averages; Long memory processes

1 Introduction

The class of continuous time Lévy-driven moving averages of the form

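$$X_t = \int_{-\infty}^{t} \varphi(t-s)\,dL_s, \qquad t \in \mathbb{R}, \tag{1.1}$$
where $(L_t)_{t\in\mathbb{R}}$ is a Lévy process with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^2] < \infty$ and $\varphi \in L^2$, is large and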

has received much attention in earlier literature. Part of the reason for this popularity

might be explained by the celebrated discrete time counterpart (in particular, ARMA

processes) as well as the Wold–Karhunen decomposition. The latter states that, up to

a drift term, essentially any centered and square integrable stationary process may

be written in the form (1.1) with $(L_t)_{t\in\mathbb{R}}$ replaced by a process with second order stationary and orthogonal increments ([2, 16]). While $\varphi$ may be specified directly,

one often characterizes it in the spectral domain in terms of its Fourier transform,

$$\mathcal{F}[\varphi](y) = \int_0^\infty e^{-iyt}\varphi(t)\,dt, \qquad y \in \mathbb{R}.$$


Paper H · Low frequency estimation of Lévy-driven moving averages

One class in the framework of (1.1) is the continuous time ARMA (CARMA) processes, where $\mathcal{F}[\varphi](y) = Q(iy)/P(iy)$ for $y \in \mathbb{R}$ and some monic polynomials $P, Q : \mathbb{C} \to \mathbb{C}$ with real coefficients, $p := \deg(P) > \deg(Q) =: q$, and $P(z) \neq 0$ for all $z \in \mathbb{C}$ with $\operatorname{Re}(z) \geq 0$.

One may regard a CARMA process as the solution to the formal equation

$$P(D)X_t = Q(D)DL_t, \qquad t \in \mathbb{R}, \tag{1.2}$$
where $D$ denotes the derivative with respect to time. Indeed, by heuristically applying the Fourier transform to (1.2) and rearranging terms one reaches the conclusion that $(X_t)_{t\in\mathbb{R}}$ is the convolution between $\varphi$ and $(DL_t)_{t\in\mathbb{R}}$. The simplest CARMA process, which has been particularly popular, is the Ornstein–Uhlenbeck process which corresponds to $p = 1$ and $q = 0$. CARMA processes have been used as models for various quantities including stochastic volatility, electricity spot prices and temperature dynamics ([5, 12, 24]), and there exists a vast amount of literature on their existence, uniqueness and representations as well as generalizations to the multivariate and fractional noise setting ([6, 19, 20]). Another class consists of affine stochastic delay differential equations (SDDEs) of the form
$$dX_t = \int_{[0,\infty)} X_{t-s}\,\eta(ds)\,dt + dL_t, \qquad t \in \mathbb{R}. \tag{1.3}$$

Here $\eta$ is a suitable finite signed measure satisfying $z - \int_{[0,\infty)} e^{-zt}\,\eta(dt) \neq 0$ for all $z \in \mathbb{C}$ with $\operatorname{Re}(z) \geq 0$. In this case, the solution of (1.3) is a moving average and the kernel $\varphi$ is determined by the relation
$$\mathcal{F}[\varphi](y) = \Bigl(iy - \int_{[0,\infty)} e^{-iyt}\,\eta(dt)\Bigr)^{-1}, \qquad y \in \mathbb{R}. \tag{1.4}$$
The choice $\eta = -\lambda\delta_0$, $\lambda > 0$, results in the Ornstein–Uhlenbeck process; a related example is considered in Example 3.2. (We use the notation $\delta_x$ for the Dirac measure at $x$.) Some relevant references on SDDEs are [4, 14].
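As a quick consistency check of relation (1.4), the Ornstein–Uhlenbeck case can be worked out by hand. The short derivation below is an illustrative sketch that uses only (1.1), (1.3) and (1.4); the indicator notation $\mathbf{1}_{[0,\infty)}$ is introduced here for convenience.

```latex
\begin{align*}
\eta = -\lambda\delta_0
  \quad&\Longrightarrow\quad
  \int_{[0,\infty)} e^{-iyt}\,\eta(dt) = -\lambda, \\
\mathcal{F}[\varphi](y)
  &= (iy + \lambda)^{-1}
   = \int_0^\infty e^{-iyt} e^{-\lambda t}\,dt, \\
\varphi(t) &= e^{-\lambda t}\,\mathbf{1}_{[0,\infty)}(t),
  \qquad
  X_t = \int_{-\infty}^{t} e^{-\lambda(t-s)}\,dL_s .
\end{align*}
```

The same kernel is obtained from the CARMA description with $P(z) = z + \lambda$ and $Q(z) = 1$, so the two parametrizations agree in this simplest case.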

Estimation of $P$ and $Q$, given a sample $X_{n:1} = [X_{n\Delta}, X_{(n-1)\Delta}, \dots, X_{\Delta}]^\top$ of equidistant observations of a CARMA process sampled at some frequency $\Delta > 0$, has received some attention. For instance, Brockwell et al. [8] show that a sampled CARMA process $(X_{t\Delta})_{t\in\mathbb{Z}}$ is a weak ARMA process. By combining this with the fact that CARMA

processes are strongly mixing ([20, Proposition 3.34]), they can rely on general results

of Francq and Zakoïan [11] to prove strong consistency and asymptotic normality

for an estimator of least squares type. Other papers dealing with low frequency

estimation of CARMA processes are [10, 22]. Küchler and Sørensen [18] studied low

frequency parametric estimation of the measure $\eta$ in (1.3) in case the support of the measure is known to be contained in some compact set and $(L_t)_{t\in\mathbb{R}}$ is a Brownian

motion. They used results about strong mixing properties of Gaussian processes

to obtain consistency and asymptotic normality of a maximum pseudo likelihood

estimator. Generally, these results for CARMA processes and solutions to SDDEs

cannot be extended to other parametric classes of

(1.1)

, since they use speciﬁc

properties of the subclass in question. Indeed, strong mixing conditions may be

diﬃcult to verify and there exist several non-trivial examples of processes which

are not strongly mixing (see the discussion and the corresponding examples in [1]).

There exist results on strong mixing properties for discrete time moving averages,


such as [13], but to the best of our knowledge, no version for the continuous time

counterpart (1.1) has been proven (not even when it is sampled on a discrete grid).

In this paper we provide a result (Theorem 2.4) concerning consistency and

asymptotic normality of an estimator of least squares type when parametrically

estimating (1.1) from a sample of low frequency observations $X_{n:1}$. To be more concrete, let $\Theta$ be a compact subset of $\mathbb{R}^d$, let $\varphi_\theta \in L^2$ for $\theta \in \Theta$, and suppose that $(X_t)_{t\in\mathbb{R}}$ follows the model (1.1) with $\varphi = \varphi_{\theta_0}$ for some unknown parameter $\theta_0 \in \Theta$. Then we will be interested in the estimator $\hat\theta_n$ obtained as a point, which minimizes
$$\sum_{t=k+1}^{n} \bigl(X_{t\Delta} - \pi_k(X_{t\Delta};\theta)\bigr)^2, \qquad \theta \in \Theta, \tag{1.5}$$
where $\pi_k(X_{t\Delta};\theta)$ denotes the projection of $X_{t\Delta}$ onto the linear $L^2(\mathbb{P})$ subspace spanned by $X_{(t-1)\Delta}, \dots, X_{(t-k)\Delta}$ under the model (1.1) with $\varphi = \varphi_\theta$. Besides the usual identifiability and smoothness conditions, the conditions given here to ensure asymptotic normality of the estimator concern the decay of the kernel. This ensures that we can apply our result in situations where the process is not, or cannot be verified to be, strongly mixing. In cases where $\varphi$ can be specified directly, e.g., when it belongs to

the class of CARMA processes or fractional noise processes, it is a straightforward

task to check the decay condition, but even when the kernel is not explicitly known

(e.g., when it can only be speciﬁed through its Fourier transform as in the SDDE case)

one can sometimes still assess its decay properties. In Example 2.3 we consider some

situations where the imposed decay condition is satisﬁed. Section 3 demonstrates the

properties of the estimator through a simulation study.

2 Estimators of interest and asymptotic results

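Let $(L_t)_{t\in\mathbb{R}}$ be a centered Lévy process with $\mathbb{E}[L_1] = 0$ and $\mathbb{E}[L_1^4] < \infty$, and suppose that $\mathbb{E}[L_1^2] = 1$. Moreover, let $\Theta$ be a compact subset of $\mathbb{R}^d$ and, for each $\theta \in \Theta$, suppose that $\varphi_\theta \in L^2$ and define the corresponding stationary process $(X_t^\theta)_{t\in\mathbb{R}}$ by
$$X_t^\theta = \int_{-\infty}^{t} \varphi_\theta(t-s)\,dL_s, \qquad t \in \mathbb{R}. \tag{2.1}$$
To avoid trivial cases we assume that $\{t \in \mathbb{R} : \varphi_\theta(t) \neq 0\}$ is not a Lebesgue null set. Let $\gamma_\theta$ be the autocovariance function of $(X_t^\theta)_{t\in\mathbb{R}}$, that is,
$$\gamma_\theta(h) := \mathbb{E}[X_h^\theta X_0^\theta] = \int_{\mathbb{R}} \varphi_\theta(h+t)\varphi_\theta(t)\,dt, \qquad h \in \mathbb{R}. \tag{2.2}$$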

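It will be assumed throughout that $\theta \mapsto \gamma_\theta(h)$ is twice continuously differentiable for all $h$. Recall that, for fixed $\Delta > 0$ and any $t \in \mathbb{Z}$, the projection of $X^\theta_{t\Delta}$ onto the linear span of $X^\theta_{(t-1)\Delta}, \dots, X^\theta_{(t-k)\Delta}$ is given by $\alpha_k(\theta)^\top X^\theta_{t-1:t-k}$, where $\alpha_k(\theta) = \Gamma_k(\theta)^{-1}\gamma_k(\theta)$, $\Gamma_k(\theta) = [\gamma_\theta((i-j)\Delta)]_{i,j=1,\dots,k}$ is the covariance matrix of $X^\theta_{t-1:t-k}$, and $\gamma_k(\theta) = [\gamma_\theta(\Delta), \dots, \gamma_\theta(k\Delta)]^\top$. (Here we use the notation $Y_{t:s} = [Y_{t\Delta}, Y_{(t-1)\Delta}, \dots, Y_{s\Delta}]^\top$ for $s, t \in \mathbb{Z}$ with $s < t$.) Note that by [7, Proposition 5.1.1], $\Gamma_k(\theta)$ is always invertible. Now suppose that $X_t = X_t^{\theta_0}$ for all $t \in \mathbb{R}$ and some unknown parameter $\theta_0$ belonging to the interior of $\Theta$, and consider $n$ equidistant observations $X_{n:1} = [X_{n\Delta}, \dots, X_{\Delta}]^\top$. We will estimate $\theta_0$ by the least squares estimator $\hat\theta_n$, which is chosen to minimize (1.5). Thus, with the introduced notation,
$$\hat\theta_n \in \operatorname*{argmin}_{\theta\in\Theta} \sum_{t=k+1}^{n} \bigl(X_{t\Delta} - \alpha_k(\theta)^\top X_{(t-1):(t-k)}\bigr)^2. \tag{2.3}$$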

The estimator (2.3) can be seen as a truncated version of
$$\tilde\theta_n \in \operatorname*{argmin}_{\theta\in\Theta} \sum_{t=2}^{n} \bigl(X_{t\Delta} - \alpha_{t-1}(\theta)^\top X_{(t-1):1}\bigr)^2. \tag{2.4}$$
From an implementation point of view, while evaluation of the objective function (2.4) will demand computing $\alpha_1(\theta), \dots, \alpha_{n-1}(\theta)$ (usually obtained recursively by the Durbin–Levinson algorithm [7, Proposition 5.2.1]), one only needs to compute $\alpha_k(\theta)$ in order to evaluate the objective function in (2.3). As discussed in [18], in short-memory models where the projection coefficients are rapidly decaying it is reasonable to use $\hat\theta_n$ with a suitably chosen depth $k$ as a proxy for (2.4).
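The computational point can be made concrete with a small sketch: the projection coefficients are obtained from the autocovariances by the Durbin–Levinson recursion, and the truncated objective in (2.3) is then a plain sum of squared one-step prediction errors. The function names (`durbin_levinson`, `ls_objective`, `ou_acov`), the list-based data layout and the Ornstein–Uhlenbeck autocovariance used as a test case are assumptions made for illustration and are not taken from the paper.

```python
import math


def durbin_levinson(acov, k):
    """Projection coefficients alpha_k = Gamma_k^{-1} gamma_k.

    acov[j] is the autocovariance at lag j * Delta for j = 0, ..., k.
    """
    alpha = []      # coefficients of the current order
    v = acov[0]     # one-step prediction error variance
    for m in range(1, k + 1):
        # Partial autocorrelation of order m.
        r = (acov[m] - sum(alpha[j] * acov[m - 1 - j] for j in range(m - 1))) / v
        alpha = [alpha[j] - r * alpha[m - 2 - j] for j in range(m - 1)] + [r]
        v *= 1.0 - r * r
    return alpha


def ls_objective(x, acov, k):
    """Truncated least squares objective; x[i] holds the observation X_{(i+1) Delta}."""
    alpha = durbin_levinson(acov, k)
    total = 0.0
    for t in range(k, len(x)):
        # alpha[j] multiplies the observation j + 1 steps back in time.
        prediction = sum(alpha[j] * x[t - 1 - j] for j in range(k))
        total += (x[t] - prediction) ** 2
    return total


def ou_acov(lam, delta, k):
    """Illustrative autocovariances exp(-lam * j * delta) / (2 * lam), j = 0, ..., k."""
    return [math.exp(-lam * j * delta) / (2.0 * lam) for j in range(k + 1)]
```

In an actual estimation routine, `ls_objective` would be minimized over the parameter by a numerical optimizer, with `acov` recomputed from the candidate parameter at each evaluation; only the order-`k` coefficients are ever needed, which is the saving relative to (2.4).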

To show strong consistency and asymptotic normality of

we impose the follow-

ing set of conditions:

Condition 2.1.

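(a) $\gamma_\theta(j\Delta) = \gamma_{\theta_0}(j\Delta)$ for $j = 0, 1, \dots, k$ if and only if $\theta = \theta_0$.
(b) $\gamma_k'(\theta_0) - \Gamma_k'(\theta_0)[\alpha_k(\theta_0) \otimes I_d]$ has full rank.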

(c)



t 7−→

s∈Z

|ϕ

(t + s∆)|



∈ L

([0,∆]) for β = 4/3,2.

Remark 2.2.

Concerning Condition 2.1, (a)–(b) are standard assumptions ensuring that $\theta_0$ is identifiable from the autocovariances and that the (suitably scaled version of the) second derivative of the objective function in (2.3) converges to an invertible deterministic matrix. The difference between Condition 2.1 and the typical set of conditions for proving asymptotic normality is that an assumption on the strong mixing coefficients of $(X_{t\Delta})_{t\in\mathbb{Z}}$ is replaced by (c), a rather explicit condition on the driving kernel. In fact, according to [21, Theorem 1.2], sufficient conditions for (c) to be satisfied are that
$$\varphi_{\theta_0} \in L^4 \quad\text{and}\quad \sup_{t\in\mathbb{R}} |t|^{\beta}\,|\varphi_{\theta_0}(t)| < \infty \tag{2.5}$$
for a suitable $\beta \in (3/4, 1)$.

Example 2.3.

In view of Remark 2.2 the key condition to check is if we are in a subclass of moving average processes, where (2.5) (or, more generally, Condition 2.1(c)) holds true. In the following we consider a few popular classes of kernels $\varphi$.
(i) CARMA and gamma: It is clear that the gamma kernel $\varphi(t) \propto t^{\beta} e^{-\gamma t}$ meets (2.5) when $\beta \in (-1/4, \infty)$ and $\gamma \in (0, \infty)$. The CARMA kernel characterized in Section 1 can always be bounded by a sum of gamma kernels (see, e.g., [6, Equation (36)]), and hence (2.5) is satisfied for this choice as well.

(ii)

SDDE: If the variation

|η|

satisﬁes

[0,∞)

|η|

(

)

< ∞

, it follows by [21,

Example 3.10] that the kernel ϕ associated to the solution of (1.3) meets (2.5).


(iii) Fractional noise: If $\varphi(t) \propto t_+^{d} - (t-\tau)_+^{d}$ for some $d \in (0, 1/4)$ and $\tau \in (0, \infty)$, then $\varphi$ is continuous and the mean value theorem implies that $\varphi(t)$ is asymptotically proportional to $t^{d-1}$ as $t \to \infty$. These properties establish the validity of (2.5). Note that the corresponding discretely sampled moving average $(X_{t\Delta})_{t\in\mathbb{Z}}$ is not strongly mixing in this setup (cf. [9, Theorem A.1]).
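The decay claim for the fractional noise kernel is easy to probe numerically. The sketch below evaluates the kernel with proportionality constant one and compares it with the asymptote $d\,\tau\,t^{d-1}$ suggested by the mean value theorem; the function names and the choice $d = 0.2$, $\tau = 1$ in the usage note are illustrative assumptions.

```python
def fractional_kernel(t, d, tau):
    """Kernel t_+^d - (t - tau)_+^d, with the proportionality constant set to one."""
    def pos(u):
        return u if u > 0.0 else 0.0
    return pos(t) ** d - pos(t - tau) ** d


def tail_ratio(t, d, tau):
    """Ratio of the kernel to the candidate asymptote d * tau * t^(d - 1), for t > 0."""
    return fractional_kernel(t, d, tau) / (d * tau * t ** (d - 1.0))


def weighted_sup(d, tau, grid):
    """Largest value of t^(1 - d) * |kernel(t)| over a finite grid of positive times."""
    return max(t ** (1.0 - d) * abs(fractional_kernel(t, d, tau)) for t in grid)
```

With $d = 0.2$ the weight exponent is $1 - d = 0.8$, which lies in the interval required for the exponent in the supremum condition of Remark 2.2; `tail_ratio(1e6, 0.2, 1.0)` is then very close to one, and `weighted_sup` stays bounded over a logarithmic grid. A finite grid can of course only suggest, not prove, boundedness.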

Before stating and proving consistency and asymptotic normality of (2.3) we introduce some notation. For a twice continuously differentiable function $f$, defined on some open subset of $\mathbb{R}^d$ and with values in $\mathbb{R}^m$, the gradient and Hessian of $f$ are denoted by $f'(\theta)$ and $f''(\theta)$, respectively:

(θ) =



∂f

∂θ

(θ),...,

∂f

∂θ

(θ)



∈ R

m×d

, f

(θ) =







∂

∂θ

···

∂

∂θ

∂

∂θ

···

∂

∂θ







∈ R

dm×d

Moreover, with

(

)

= [1

,−α

(

)

(

)

= [0

,α

(

)

] and

(

;

) = [

(

t −

(

i −

1)∆)ϕ

(t −(j −s −1)∆)]

i,j=1,...,k+1

we deﬁne

(t;θ) = v

(θ)

(t;θ)v

(θ) for i,j = 1,2 and s ∈ Z. (2.6)

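Finally, we set $\sigma^2 = \mathbb{E}[L_1^2]$ and $\kappa_4 = \mathbb{E}[L_1^4] - 3\sigma^4$.

Theorem 2.4. Suppose that $\theta_0$ belongs to the interior of $\Theta$ and that Condition 2.1 is in force. Let $\hat\theta_n$ be the estimator given in (2.3). Then $\hat\theta_n \to \theta_0$ almost surely and $\sqrt{n}(\hat\theta_n - \theta_0) \xrightarrow{\mathcal{D}} N(0, H^{-1}AH^{-1})$ as $n \to \infty$, where $H = 2\alpha_k'(\theta_0)^\top \Gamma_k(\theta_0)\alpha_k'(\theta_0)$ and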

A =

s∈Z



(t;θ

) dt + σ

(t;θ

) dt

(t;θ

) dt

+ σ

(t;θ

) dt

(t;θ

) dt



(2.7)

Proof.

Set $\ell_n(\theta) = \sum_{t=k+1}^{n} (X_{t\Delta} - \alpha_k(\theta)^\top X_{(t-1):(t-k)})^2$, and let $\ell_n'$ and $\ell_n''$ be the first and second order derivative of $\ell_n$, respectively. As usual, the consistency and part of the asymptotic normality rely on an application of a suitable (uniform) ergodic theorem to ensure almost sure convergence of the sequences $(n^{-1}\ell_n)_{n\in\mathbb{N}}$ and $(n^{-1}\ell_n'')_{n\in\mathbb{N}}$. The difference lies in the proof of a central limit theorem for $(n^{-1/2}\ell_n'(\theta_0))_{n\in\mathbb{N}}$.

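Consistency: Note that $\mathbb{E}[\sup_{\theta\in\Theta}(X_{k\Delta} - \alpha_k(\theta)^\top X_{(k-1):0})^2] < \infty$, since the vector of projection coefficients $\alpha_k(\theta)$ is bounded due to the continuity of $\theta \mapsto \gamma_\theta(h)$. Thus, we find by the ergodic theorem for Banach spaces ([23, Theorem 2.7]) that $n^{-1}\ell_n(\theta) \to \mathbb{E}[(X_{k\Delta} - \alpha_k(\theta)^\top X_{(k-1):0})^2] =: \ell^*(\theta)$ almost surely and uniformly in $\theta$ as $n \to \infty$. Thus, strong consistency follows immediately if $\ell^*$ is uniquely minimized at $\theta_0$. Since $\alpha_k(\theta_0)^\top X_{(k-1):0}$ is the projection of $X_{k\Delta}$ onto the linear span of $X_0, \dots, X_{(k-1)\Delta}$, it must be the case that $\ell^*(\theta_0) \leq \ell^*(\theta)$ for all $\theta \in \Theta$. If $\theta \neq \theta_0$, Condition 2.1(a) implies that $\gamma_\theta(j\Delta) \neq \gamma_{\theta_0}(j\Delta)$ for at least one $j$, and hence $\ell^*(\theta_0) < \ell^*(\theta)$ by uniqueness of the projection coefficients.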


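Asymptotic normality: It suffices to show that (i) $n^{-1}\ell_n''(\theta)$ converges almost surely and uniformly in $\theta$ as $n \to \infty$ and $H := \lim_{n\to\infty} n^{-1}\ell_n''(\theta_0)$ is a deterministic positive definite matrix, and (ii) $n^{-1/2}\ell_n'(\theta_0)$ converges in distribution to a Gaussian random variable. Concerning (i), note that
$$\ell_n''(\theta) = 2\sum_{t=k+1}^{n} \Bigl(\alpha_k'(\theta)^\top X_{(t-1):(t-k)} X_{(t-1):(t-k)}^\top \alpha_k'(\theta) - \bigl(X_{t\Delta} - \alpha_k(\theta)^\top X_{(t-1):(t-k)}\bigr)\bigl[X_{(t-1):(t-k)}^\top \otimes I_d\bigr]\alpha_k''(\theta)\Bigr),$$
where $I_d$ is the $d \times d$ identity matrix and the $i$th row of $\alpha_k'$ (resp. the $i$th $d \times d$ block of $\alpha_k''$) is the gradient (resp. Hessian) of the $i$th entry of $\alpha_k$. Thus, it follows by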

[23, Theorem 2.7] that

−1

(

)

→

(

)

(

)

(

)

C H

(

) almost surely and

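uniformly in $\theta$ as $n \to \infty$. Since $\Gamma_k(\theta_0)$ is positive definite and
$$\alpha_k'(\theta_0) = \Gamma_k(\theta_0)^{-1}\bigl(\gamma_k'(\theta_0) - \Gamma_k'(\theta_0)[\alpha_k(\theta_0) \otimes I_d]\bigr)$$
it follows from Condition 2.1(b) that $H(\theta_0)$ is positive definite. To show (ii), observe that $\ell_n'(\theta_0)$ takes the form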

(θ

) =

t=k+1

(t∆ −s) dL

with

(

) =

(

)

(

), using the notation

(

) = [

(

)

,ϕ

(

t −∆

)

,... , ϕ

(

t −

k∆)]

. Since the space of functions f satisfying



t 7−→

s∈Z

|f (t + s∆)|



∈ L

([0,∆]) for β = 4/3,2 (2.8)

forms a vector space, and

satisﬁes

(2.8)

by Condition 2.1(c),

and (each entry

of)

satisfy

(2.8)

as well. Moreover, as

(

t∆ −s

)

t∆

−α

(

)

(t−1):(t−k)

is orthogonal to

(

t∆ − s

)

(t−1):(t−k)

(

) in

(

) (entrywise), we have

that

[

(

)] = 0. Consequently, by [21, Theorem 1.2],

−1/2

(

) converges in

distribution to a centered Gaussian vector with covariance matrix given by

s∈Z



(t)ψ

(t + s∆)ψ

(t)ψ

(t + s∆)

dt + σ

(t)ψ

(t + s∆) dt

(t)ψ

(t + s∆)

dt + σ

(t)ψ

(t + s∆) dt

(t)

(t + s∆) dt



which is equal to A given in (2.7). This concludes the proof. 

3 Examples

In this section we give two examples where Theorem 2.4 is applicable and accompany

these by simulating the properties of the estimator $\hat\theta_n$. In both examples we fix the sample frequency $\Delta = 1$ as well as the depth $k = 10$. We have checked (by simulation) that the estimator is rather insensitive to the choice of $k$; this is supported by the fact

that both models result in geometrically decaying projection coeﬃcients.


Example 3.1.

Suppose that $(L_t)_{t\in\mathbb{R}}$ is a standard Brownian motion and, for $\theta = (\nu, \lambda) \in (3/4, \infty) \times (0, \infty)$, set
$$\varphi_\theta(t) = \Gamma(\nu)^{-1} t^{\nu-1} e^{-\lambda t}, \qquad t > 0. \tag{3.1}$$
The moving average model (2.1) with gamma kernel (3.1) has received some attention in the literature and has, e.g., been used to model the timewise behavior of the velocity in turbulent regimes (see [3] and references therein). Moreover, particular choices of $\nu$ result in special cases of well-known and widely studied models. To be concrete, if $\nu = 1$ then $(X_t^\theta)_{t\in\mathbb{R}}$ is an Ornstein–Uhlenbeck process with parameter $\lambda > 0$ and, more generally, if $\nu \in \mathbb{N}$ then $(X_t^\theta)_{t\in\mathbb{R}}$ is a CAR($\nu$) process with polynomial $P(z) = (z + \lambda)^{\nu}$.

The autocovariance function $\gamma_\theta$ of $(X_t^\theta)_{t\in\mathbb{R}}$ under the model specification (2.1) and (3.1) takes the form
$$\gamma_\theta(h) = \begin{cases} \Gamma(\nu)^{-2}\,\Gamma(2\nu - 1)(2\lambda)^{1-2\nu} & \text{if } h = 0, \\ \Gamma(\nu)^{-1}\,\pi^{-1/2}\,2^{1/2-\nu}\,(\lambda^{-1}|h|)^{\nu-1/2} K_{\nu-1/2}(\lambda|h|) & \text{if } h \neq 0, \end{cases}$$
where $K_{\nu-1/2}$ denotes the modified Bessel function of the third kind of order $\nu - 1/2$ (cf. [3]). The corresponding autocorrelation function $\gamma_\theta/\gamma_\theta(0)$ is known as the Whittle–Matérn correlation function ([15]). In Figure 1 we have simulated $X_{400:1}$ and plotted the corresponding sample and theoretical autocorrelation function for $\theta_0 = (1.3, 1.1)$.
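For integer shape parameters the autocovariance of the gamma kernel model can be cross-checked without Bessel functions. The sketch below approximates the convolution integral in (2.2) by the trapezoidal rule and compares it with the elementary closed forms obtained by direct integration for $\nu = 1$ and $\nu = 2$. The function names, the restriction to $\nu \geq 1$ (which avoids the singularity of the kernel at zero) and the truncation point are assumptions of this illustration.

```python
import math


def gamma_kernel(t, nu, lam):
    """Gamma kernel t^(nu - 1) * exp(-lam * t) / Gamma(nu) for t >= 0 (nu >= 1 assumed)."""
    if t < 0.0:
        return 0.0
    return t ** (nu - 1.0) * math.exp(-lam * t) / math.gamma(nu)


def acov_numeric(h, nu, lam, upper=40.0, steps=40000):
    """Trapezoidal approximation of the integral of kernel(|h| + t) * kernel(t) over [0, upper]."""
    h = abs(h)
    dt = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gamma_kernel(h + t, nu, lam) * gamma_kernel(t, nu, lam)
    return total * dt


def acov_closed_form(h, nu, lam):
    """Elementary closed forms for nu = 1 and nu = 2, obtained by direct integration."""
    h = abs(h)
    if nu == 1:
        return math.exp(-lam * h) / (2.0 * lam)
    if nu == 2:
        return math.exp(-lam * h) * (1.0 + lam * h) / (4.0 * lam ** 3)
    raise ValueError("only nu = 1 and nu = 2 are covered in this sketch")
```

For non-integer $\nu$ one would instead evaluate the Bessel-function expression with a special-function library; the quadrature above is only meant as an independent sanity check of the autocovariances fed into the least squares objective.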

Figure 1: Left: simulation of $X_{400:1}$ under the model specification (2.1) and (3.1) with $\theta_0 = (1.3, 1.1)$ when $(L_t)_{t\in\mathbb{R}}$ is a Brownian motion. Right: the corresponding sample autocorrelation function and its theoretical counterpart.

To demonstrate the ability to infer the true parameter $\theta_0 = (\nu_0, \lambda_0)$ from $X_{n:1}$ using the least squares estimator (2.3) we simulate

$X_{n:1}$ under the model corresponding to $\theta_0$ for $n \in \{400, 1600, 6400\}$, obtain the associated realizations of $\hat\theta_n = (\hat\nu_n, \hat\lambda_n)$ for truncation lag $k = 10$ and repeat the experiment 500 times. We perform this study for different choices of $\theta_0$

. In Table 1 we have, for each

, summarized the sample

mean, bias and variance for the realizations of the least squares estimator. To show

the robustness regarding the choice of the underlying noise we did the same analysis

in the case where $(L_t)_{t\in\mathbb{R}}$ is a centered gamma Lévy process with both shape and scale parameter equal to one. In other words, $(L_t)_{t\in\mathbb{R}}$ was chosen to be the unique Lévy process where $L_1$ has density $t \mapsto \mathbf{1}_{\{t \geq -1\}} e^{-t-1}$. The findings, which are reported

in Table 2, are seen to be similar to those of Table 1. To illustrate the asymptotic


normality of $\hat\theta_n$ we have plotted histograms based on the 500 realizations of $\hat\nu_n$ and $\hat\lambda_n$ when $n = 6400$ in the situation where $(\nu_0, \lambda_0) = (1.3, 1.1)$, see Figure 2.
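The summary statistics reported for such a Monte Carlo study are simple to compute once the replicated estimates are collected. The helper below is a minimal sketch; the function name, the dictionary layout and the use of the unbiased variance (divisor $m - 1$ for $m$ replications) are assumptions, since the text does not state which variance convention was used.

```python
def summarize(estimates, true_value):
    """Sample mean, bias and unbiased sample variance of replicated estimates."""
    m = len(estimates)
    mean = sum(estimates) / m
    variance = sum((e - mean) ** 2 for e in estimates) / (m - 1)
    return {"mean": mean, "bias": mean - true_value, "variance": variance}
```

Applied separately to the replicated estimates of each parameter component, this yields one (mean, bias, variance) triple per sample size, which is the layout used in the tables.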

Example 3.2.

As in the last part of Example 3.1 let $(L_t)_{t\in\mathbb{R}}$ be a centered gamma Lévy process with both shape and scale parameter equal to one, and consider the model (1.3) where $\eta = \alpha\delta_0 + \beta\delta_1$ for some $\alpha, \beta \in \mathbb{R}$:
$$dX_t = (\alpha X_t + \beta X_{t-1})\,dt + dL_t, \qquad t \in \mathbb{R}. \tag{3.2}$$
We will perform a simulation study similar to that of [18], except that they consider a Brownian motion as the underlying noise and use a certain pseudo (Gaussian) likelihood rather than the least squares estimator in (2.3). In [17] it is argued that a stationary solution to (3.2) exists if $\alpha < 1$ and
$$\beta \in \begin{cases} \bigl(-\alpha/\cos(\xi(\alpha)),\, -\alpha\bigr) & \text{if } \alpha \neq 0, \\ \bigl(-\pi/2,\, 0\bigr) & \text{if } \alpha = 0. \end{cases}$$
The function $\xi$ is characterized by $\xi(0) = \pi/2$ and $\xi(t) = t\tan(\xi(t))$ for $t \neq 0$. We will compute (2.3) by using that
$$\gamma_\theta(h) = \frac{1}{\pi}\int_0^\infty \frac{\cos(hy)}{|iy + \alpha + \beta e^{iy}|^2}\,dy, \qquad h \in \mathbb{R},$$
which follows from (1.4), (2.2) and Plancherel’s theorem. We let $(\alpha_0, \beta_0) = (-1, -0.1353)$ in line with [18], and in Table 3 we provide statistics similar to those of Tables 1–2.
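The two ingredients of this example, the parameter region for stationarity and the spectral formula for the autocovariance, can be sketched as follows. The function names, the restriction of the auxiliary root to the interval $(0, \pi)$, the bisection tolerance and the truncation of the frequency integral are assumptions of this illustration rather than details taken from the paper.

```python
import math


def xi(t, tol=1e-12):
    """Root of y = t * tan(y) on (0, pi), found by bisection; requires t < 1."""
    if t == 0.0:
        return math.pi / 2.0
    lo, hi = 1e-9, math.pi - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # y * cot(y) - t is decreasing on (0, pi), so the sign locates the root.
        if mid * math.cos(mid) / math.sin(mid) - t > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def is_stationary(alpha, beta):
    """Check the sufficient condition for a stationary solution quoted from [17]."""
    if alpha >= 1.0:
        return False
    lower = -math.pi / 2.0 if alpha == 0.0 else -alpha / math.cos(xi(alpha))
    return lower < beta < -alpha


def acov_spectral(h, alpha, beta, upper=2000.0, steps=200000):
    """Trapezoidal approximation of (1/pi) * int_0^upper cos(h y) / |iy + alpha + beta e^{iy}|^2 dy."""
    dy = upper / steps
    total = 0.0
    for i in range(steps + 1):
        y = i * dy
        weight = 0.5 if i in (0, steps) else 1.0
        denom = (alpha + beta * math.cos(y)) ** 2 + (y + beta * math.sin(y)) ** 2
        total += weight * math.cos(h * y) / denom
    return total * dy / math.pi
```

Setting $\beta = 0$ and $\alpha = -\lambda$ reduces the model to the Ornstein–Uhlenbeck case, for which the integral should be close to $e^{-\lambda|h|}/(2\lambda)$; this gives a convenient check of the quadrature. The truncation at `upper` leaves an error of order $1/(\pi \cdot \texttt{upper})$ at lag zero, so a tail correction or an adaptive rule would be preferable in serious use.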

Table 1: Sample mean, bias and variance based on 500 realizations of $\hat\theta_n = (\hat\nu_n, \hat\lambda_n)$ for various choices of $\theta_0 = (\nu_0, \lambda_0)$. The noise is a Brownian motion.

                               n = 400              n = 1600             n = 6400
                           ν-hat    λ-hat       ν-hat    λ-hat       ν-hat    λ-hat
ν_0 = 1.3, λ_0 = 1.1
  Mean                     1.3869   1.1613      1.3353   1.1271      1.3008   1.0982
  Bias                     0.0869   0.0613      0.0353   0.0271      0.0008  −0.0018
  Var.×10                  1.2143   1.5452      0.3553   0.4501      0.0749   0.1039
ν_0 = 0.9, λ_0 = 1.1
  Mean                     1.1460   1.4244      0.9867   1.2151      0.9092   1.1079
  Bias                     0.2460   0.3244      0.0857   0.1151      0.0092   0.0079
  Var.×10                  2.0205   4.7529      0.6148   1.6267      0.0851   0.3354
ν_0 = 1.3, λ_0 = 0.5
  Mean                     1.3202   0.5166      1.3079   0.5060      1.2989   0.4987
  Bias                     0.0202   0.0166      0.0079   0.0060     −0.0011  −0.0013
  Var.×10                  0.1910   0.1424      0.0417   0.0333      0.0099   0.0079


Table 2: Sample mean, bias and variance based on 500 realizations of $\hat\theta_n = (\hat\nu_n, \hat\lambda_n)$ for various choices of $\theta_0 = (\nu_0, \lambda_0)$. The noise is a centered gamma Lévy process.

                               n = 400              n = 1600             n = 6400
                           ν-hat    λ-hat       ν-hat    λ-hat       ν-hat    λ-hat
ν_0 = 1.3, λ_0 = 1.1
  Mean                     1.3638   1.1537      1.3158   1.1234      1.2870   1.0969
  Bias                     0.0638   0.0537      0.0158   0.0234     −0.0130  −0.0031
  Var.×10                  1.1162   1.5069      0.3358   0.4505      0.0729   0.1061
ν_0 = 0.9, λ_0 = 1.1
  Mean                     1.1339   1.3813      1.0049   1.2249      0.9323   1.1262
  Bias                     0.2339   0.2813      0.1049   0.1249      0.0323   0.0262
  Var.×10                  1.8303   4.3999      0.5879   1.5714      0.0900   0.3446
ν_0 = 1.3, λ_0 = 0.5
  Mean                     1.3017   0.5095      1.2902   0.5000      1.2871   0.4964
  Bias                     0.0017   0.0095     −0.0098   0.0000     −0.0129  −0.0036
  Var.×10                  0.1615   0.1352      0.0401   0.0319      0.0097   0.0079

Figure 2: Histograms of 500 realizations of $(\hat\nu_{6400}, \hat\lambda_{6400})$ when $(\nu_0, \lambda_0) = (1.3, 1.1)$ and the noise is a gamma Lévy process.

Table 3: Sample mean, bias and variance based on 500 realizations of $\hat\theta_n = (\hat\alpha_n, \hat\beta_n)$ for various choices of $n$ when the true parameters are $\alpha_0 = -1$ and $\beta_0 = -0.1353$. The noise is a centered gamma Lévy process.

                 n = 400              n = 1600             n = 6400
             α-hat    β-hat       α-hat    β-hat       α-hat    β-hat
  Mean      −0.9980  −0.1654     −1.0127  −0.1508     −1.0132  −0.1459
  Bias       0.0020  −0.0301     −0.0127  −0.0155     −0.0132  −0.0106
  Var.×10    0.3022   0.0979      0.1165   0.0498      0.0379   0.0189


Acknowledgments

This work was supported by the Danish Council for Independent Research (grant

DFF–4002–00003).

References

[1] Ango Nze, P., P. Bühlmann and P. Doukhan (2002). Weak dependence beyond mixing and asymptotics for nonparametric regression. Ann. Statist. 30(2), 397–430. doi: 10.1214/aos/1021379859.

[2] Barndorff-Nielsen, O.E. and A. Basse-O’Connor (2011). Quasi Ornstein–Uhlenbeck processes. Bernoulli 17(3), 916–941. doi: 10.3150/10-BEJ311.

[3] Barndorff-Nielsen, O.E. et al. (2012). Notes on the gamma kernel. Thiele Research Reports, Department of Mathematics, Aarhus University.

[4] Basse-O’Connor, A., M.S. Nielsen, J. Pedersen and V. Rohde (2018). Multivariate stochastic delay differential equations and CAR representations of CARMA processes. Stochastic Process. Appl. Forthcoming. doi: 10.1016/j.spa.2018.11.011.

[5] Benth, F.E., J. Šaltytė-Benth and S. Koekebakker (2007). Putting a price on temperature. Scand. J. Statist. 34(4), 746–767. doi: 10.1111/j.1467-9469.2007.00564.x.

[6] Brockwell, P.J. (2014). Recent results in the theory and applications of CARMA processes. Ann. Inst. Statist. Math. 66(4), 647–685. doi: 10.1007/s10463-014-0468-7.

[7] Brockwell, P.J. and R.A. Davis (2006). Time series: theory and methods. Springer Series in Statistics. Reprint of the second (1991) edition. Springer, New York.

[8] Brockwell, P.J., R.A. Davis and Y. Yang (2011). Estimation for non-negative Lévy-driven CARMA processes. J. Bus. Econom. Statist. 29(2), 250–259. doi: 10.1198/jbes.2010.08165.

[9] Cohen, S. and A. Lindner (2013). A central limit theorem for the sample autocorrelations of a Lévy driven continuous time moving average process. J. Statist. Plann. Inference 143(8), 1295–1306. doi: 10.1016/j.jspi.2013.03.022.

[10] Fasen-Hartmann, V. and S. Kimmig (2018). Robust estimation of continuous-time ARMA models via indirect inference. arXiv: 1804.00849.

[11] Francq, C. and J.-M. Zakoïan (1998). Estimating linear representations of nonlinear processes. J. Statist. Plann. Inference 68(1), 145–165.

[12] García, I., C. Klüppelberg and G. Müller (2011). Estimation of stable CARMA models with an application to electricity spot prices. Stat. Model. 11(5), 447–470. doi: 10.1177/1471082X1001100504.

[13] Gorodetskii, V. (1978). On the strong mixing property for linear sequences. Theory Probab. Appl. 22(2), 411–413.

[14] Gushchin, A.A. and U. Küchler (2000). On stationary solutions of delay differential equations driven by a Lévy process. Stochastic Process. Appl. 88(2), 195–211. doi: 10.1016/S0304-4149(99)00126-X.

[15] Guttorp, P. and T. Gneiting (2005). On the Whittle-Matérn correlation family. National Research Center for Statistics and the Environment-Technical Report Series, Seattle, Washington.

[16] Karhunen, K. (1950). Über die Struktur stationärer zufälliger Funktionen. Ark. Mat. 1, 141–160. doi: 10.1007/BF02590624.

[17] Küchler, U. and B. Mensch (1992). Langevin’s stochastic differential equation extended by a time-delayed term. Stochastics Stochastics Rep. 40(1-2), 23–42. doi: 10.1080/17442509208833780.

[18] Küchler, U. and M. Sørensen (2013). Statistical inference for discrete-time samples from affine stochastic delay differential equations. Bernoulli 19(2), 409–425. doi: 10.3150/11-BEJ411.

[19] Marquardt, T. (2007). Multivariate fractionally integrated CARMA processes. Journal of Mult. Anal. 98(9), 1705–1725.

[20] Marquardt, T. and R. Stelzer (2007). Multivariate CARMA processes. Stochastic Process. Appl. 117(1), 96–120. doi: 10.1016/j.spa.2006.05.014.

[21] Nielsen, M.S. and J. Pedersen (2019). Limit theorems for quadratic forms and

related quantities of discretely sampled continuous-time moving averages.

ESAIM: Probab. Stat. Forthcoming. doi: 10.1051/ps/2019008.

[22]

Schlemm, E. and R. Stelzer (2012). Quasi maximum likelihood estimation for

strongly mixing state space models and multivariate Lévy-driven CARMA

processes. Electron. J. Stat. 6, 2185–2234. doi: 10.1214/12-EJS743.

[23]

Straumann, D. and T. Mikosch (2006). Quasi-maximum-likelihood estimation

in conditionally heteroscedastic time series: a stochastic recurrence equations

approach. Ann. Statist. 34(5), 2449–2495. doi:

10.1214/009053606000000803

[24]

Todorov, V. and G. Tauchen (2006). Simulation methods for Lévy-driven

continuous-time autoregressive moving average (CARMA) stochastic volatility

models. J. Bus. Econom. Statist. 24(4), 455–469. doi:

10.1198/07350010600000

0260.


Paper I

A Statistical View on a Surrogate Model for

Estimating Extreme Events with an

Application to Wind Turbines

Mikkel Slot Nielsen and Victor Rohde

Abstract

In the present paper we propose a surrogate model, which particularly aims at estimating extreme events from a vector of covariates and a suitable simulation environment. The first part introduces the model rigorously and discusses the flexibility of each of its components by drawing relations to literature within fields such as incomplete data, statistical matching, outlier detection and conditional probability estimation. In the second part of the paper we study the performance of the model in the estimation of extreme loads on an operating wind turbine from its operational statistics.

MSC: 62P30; 65C20; 91B68

Keywords: Extreme event estimation; Wind turbines; Surrogate model

1 Introduction

Suppose that we are interested in the distributional properties of a certain one-dimensional random variable $Y$. For instance, one may want to know the probability of the occurrence of large values of $Y$, as they could be associated with a large risk such as system failure or a company default. One way to evaluate such risks would be to collect observations $y_1, \dots, y_n$ of $Y$ and then fit a suitable distribution (for instance, the generalized Pareto distribution) to the largest of them. Extreme event estimation is a huge area and there exists a vast amount of literature on both methodology and applications; a few references are [4, 5, 12, 17]. This is one example where knowledge of the empirical distribution of $Y$,

$$\hat{P}(\delta_{y_1}, \dots, \delta_{y_n}) = \frac{1}{n} \sum_{i=1}^{n} \delta_{y_i}, \tag{1.1}$$

is valuable. (Here $\delta_y$ denotes the Dirac measure at the point $y$.) If one is interested in the entire distribution of $Y$, one may use the estimator (1.1) directly or a smoothed version, for example, replacing $\delta_{y_i}$ by the Gaussian distribution with mean $y_i$ and variance $\sigma^2 > 0$ (the latter usually referred to as the bandwidth). The problem in determining (1.1) arises if $Y$ is not observable. Such a situation can happen for several reasons; for instance, it may be that $Y$ is difficult or expensive to measure or that its importance has just recently been recognized, and hence one has not collected the historic data that is needed. Sometimes, a solution to the problem of having a latent variable could be to set up a suitable simulation environment and, by varying the conditions of the system, obtain various realizations of $Y$. Since we cannot be sure that the variations in the simulation environment correspond to the variations in the physical environment, the realizations of $Y$ are not necessarily drawn from the true distribution. This is essentially similar to any experimental study and one will have to rely on the existence of control variables.

By assuming the existence of an observable $d$-dimensional vector $X$ of covariates carrying information about the environment, a typical way to proceed would be regression/matching which in turn would form a surrogate model. To be concrete, given a realization $x$ of $X$, a surrogate model is expected to output (approximately) $f(x) = E[Y \mid X = x]$, the conditional mean of $Y$ given $X = x$. Consequently, given inputs $x_1, \dots, x_n$, the model would produce $f(x_1), \dots, f(x_n)$ as stand-ins for the missing values $y_1, \dots, y_n$. Building a surrogate for the distribution of $Y$ on top of this could now be done by replacing $y_i$ by $f(x_i)$ in (1.1) to obtain an estimate $\hat{P}(\delta_{f(x_1)}, \dots, \delta_{f(x_n)})$ of the distribution of $Y$. This surrogate model for the distribution of $Y$ can thus be seen as a composition of two maps:

$$(x_1, \dots, x_n) \longrightarrow (\delta_{f(x_1)}, \dots, \delta_{f(x_n)}) \longrightarrow \hat{P}(\delta_{f(x_1)}, \dots, \delta_{f(x_n)}). \tag{1.2}$$

In the context of an incomplete data problem, the strategy of replacing unobserved quantities by the corresponding conditional means is called regression imputation and will generally not provide a good estimate of the distribution of $Y$. For instance, while the (unobtainable) estimate in (1.1) converges weakly to the distribution of $Y$ as the sample size $n$ increases, the one provided by (1.2) converges weakly to the distribution of the conditional expectation $E[Y \mid X]$ of $Y$ given $X$. In fact, any of the so-called single imputation approaches, including regression imputation, usually results in proxies $\hat{y}_1, \dots, \hat{y}_n$ which exhibit less variance than the original values $y_1, \dots, y_n$, and in this case $\hat{P}(\delta_{\hat{y}_1}, \dots, \delta_{\hat{y}_n})$ will provide a poor estimate of the distribution of $Y$ (see [15] for details).
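The loss of variance under regression imputation can be made explicit through the law of total variance. This is a standard identity, added here as a supplementary sketch under the assumption that $Y$ has a finite second moment:

```latex
\operatorname{Var}(Y)
  = E\bigl[\operatorname{Var}(Y \mid X)\bigr] + \operatorname{Var}\bigl(E[Y \mid X]\bigr)
  \;\ge\; \operatorname{Var}\bigl(E[Y \mid X]\bigr).
```

Equality holds only if $Y = E[Y \mid X]$ almost surely, so the conditional mean is typically less dispersed than $Y$ itself, and its tail is correspondingly lighter.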

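The reason that the approach (1.2) works unsatisfactorily is that $\delta_{f(X)}$ is an (unbiased) estimator for the distribution of $E[Y \mid X]$ rather than of $Y$. For this reason we will replace $\delta_{f(x)}$ by an estimator $\hat{\mu}_x$ for the conditional distribution $\mu_x$ of $Y$ given $X = x$ and maintain the overall structure of (1.2):

$$(x_1, \dots, x_n) \longrightarrow (\hat{\mu}_{x_1}, \dots, \hat{\mu}_{x_n}) \longrightarrow \hat{P}(\hat{\mu}_{x_1}, \dots, \hat{\mu}_{x_n}). \tag{1.3}$$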


In Section 2 we introduce the model (1.3) rigorously and relate the assumptions on the simulation environment needed to estimate $\mu_x$ to the classical strong ignorability (or unconfoundedness) assumption within a matching framework. Given a simulation environment that satisfies this assumption, an important step in order to apply the surrogate model (1.3) is of course to decide how to estimate $\mu_x$, and hence we discuss in Section 2.1 some methods that are suitable for conditional probability estimation. In Section 2.2 we address the issue of checking if the simulation environment meets the imposed assumptions. Finally, in Section 3 we apply the surrogate model to real-world data as we estimate extreme tower loads on a wind turbine from its operational statistics.

2 The model

Let $P$ be the physical probability measure. Recall that $Y$ is the one-dimensional random variable of interest, $X$ is a $d$-dimensional vector of covariates and $x_1, \dots, x_n$ are realizations of $X$ under $P$. We are interested in a surrogate model that delivers an estimate of $P(Y \in B)$ for every measurable set $B$. The model is given by

$$\hat{P}(\hat{\mu}_{x_1}, \dots, \hat{\mu}_{x_n}) = \frac{1}{n} \sum_{i=1}^{n} \hat{\mu}_{x_i}, \tag{2.1}$$

where $\hat{\mu}_x$ is an estimator for the conditional distribution $\mu_x$ of $Y$ given $X = x$. Since each $x_i$ is drawn independently of $\hat{\mu}_x$ under $P$, each $\hat{\mu}_{x_i}$ provides an estimator of $P(Y \in \cdot\,)$ and the averaging in (2.1) may be expected to force the variance of the estimator to zero as $n$ tends to infinity. In order to obtain $\hat{\mu}_x$ we need to assume the existence of a valid simulation tool:

Condition 2.1. Realizations of $(X, Y)$ can be obtained under an artificial probability measure $Q$ which satisfies

(i) The support of $P(X \in \cdot\,)$ is contained in the support of $Q(X \in \cdot\,)$.

(ii) The conditional distribution of $Y$ given $X$ is the same under both $P$ and $Q$, that is, $Q(Y \in \cdot \mid X = x) = \mu_x$ for all $x$ in the support of $P(X \in \cdot\,)$.

In words, Condition 2.1 says that any outcome of $X$ that can happen in the real world can also happen in the simulation environment and, given an outcome of $X$, the probabilistic structure of $Y$ in the real world is perfectly mimicked by the simulation tool. Note that, while this is a rather strict assumption, it may of course be relaxed to $Q(Y \in B \mid X = x) = \mu_x(B)$ for all $x$ in the support of $P(X \in \cdot\,)$ and any set $B$ of interest. For instance, in Section 3 we will primarily be interested in $B = (\tau, \infty)$ for a large threshold $\tau$.

Remark 2.2. We can assume, possibly by modifying the sample space, the existence of a random variable $Z \in \{0, 1\}$ and a probability measure $\tilde{P}$ such that

$$P = \tilde{P}(\,\cdot \mid Z = 0) \quad \text{and} \quad Q = \tilde{P}(\,\cdot \mid Z = 1).$$

Effectively, $Z$ indicates whether we are using the simulation tool or not, and $\tilde{P}(Z = 1) \in (0, 1)$ defines the probability of drawing $(X, Y)$ from the simulation environment (as opposed to drawing $X$ from the measurement environment). In this case, according to Bayes’ rule, Condition 2.1 is equivalent to

$$\tilde{P}(Z = 1 \mid X, Y) = \tilde{P}(Z = 1 \mid X). \tag{2.2}$$

In words, (2.2) means that $Y$ and $Z$ are conditionally independent under $\tilde{P}$ given $X$. The assumption (2.2) was introduced in Rosenbaum and Rubin [13] as the strong ignorability assumption in relation to estimating heterogeneous treatment effects. In the literature on incomplete data, where $Z$ indicates whether $Y$ is observed or not, (2.2) is usually known as the Missing at Random (in short, MAR) mechanism, referring to the pattern of which $Y$ is missing. This assumption is often imposed and viewed as necessary in order to do inference. See [9, 14, 15] for details about the incomplete data problem and the MAR mechanism.
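The Bayes' rule step behind (2.2) can be sketched as follows. This is a heuristic derivation under an added assumption that is not part of the original text: writing $\tilde{P}$ for the joint measure of this remark, suppose that, conditionally on $Z = z$, the pair $(X, Y)$ has a density of the form $g_z(x)\,h_z(y \mid x)$, and put $\pi_z = \tilde{P}(Z = z)$.

```latex
\tilde{P}(Z = 1 \mid X = x, Y = y)
  = \frac{\pi_1\, g_1(x)\, h_1(y \mid x)}
         {\pi_0\, g_0(x)\, h_0(y \mid x) + \pi_1\, g_1(x)\, h_1(y \mid x)}
  \;\overset{h_0 = h_1}{=}\;
    \frac{\pi_1\, g_1(x)}{\pi_0\, g_0(x) + \pi_1\, g_1(x)}
  = \tilde{P}(Z = 1 \mid X = x).
```

The middle equality uses Condition 2.1(ii), which in this notation reads $h_0 = h_1$; conversely, if the left-hand side does not depend on $y$, the ratio $h_1(y \mid x) / h_0(y \mid x)$ cannot depend on $y$ either and must therefore equal one.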

Remark 2.3. Usually, to meet Condition 2.1(ii), one will search for a high-dimensional $X$ (large $d$) to control for as many factors as possible. However, as this complicates the estimation of $\mu_x$, one may be interested in finding a function $b \colon \mathbb{R}^d \to \mathbb{R}^m$, $m < d$, maintaining the property

$$P(Y \in \cdot \mid b(X) = b(x)) = Q(Y \in \cdot \mid b(X) = b(x)) \tag{2.3}$$

for all $x$ in the support of $P(X \in \cdot\,)$. This is a well-studied problem in statistical matching with the main reference being Rosenbaum and Rubin [13], who referred to any such $b$ as a balancing function. They characterized the class of balancing functions by first showing that (2.3) holds if $b$ is chosen to be the propensity score under $\tilde{P}$ (cf. Remark 2.2), $\pi(x) = \tilde{P}(Z = 1 \mid X = x)$, and next arguing that a general function $b$ is a balancing function if and only if

$$f(b(x)) = \pi(x) \quad \text{for some function } f. \tag{2.4}$$

2.1 Estimation of the conditional probability

The ultimate goal is to estimate $\mu_x = Q(Y \in \cdot \mid X = x)$, for instance, in terms of the cumulative distribution function (CDF) or density function, from a sample $(x^s_1, y^s_1), \dots, (x^s_m, y^s_m)$ of $(X, Y)$ under the artificial measure $Q$. (We use the notation $x^s_i$ rather than $x_i$ to emphasize that the quantities are simulated values and should not be confused with those in (2.1).) The literature on conditional probability estimation is fairly large and includes both parametric and non-parametric approaches varying from simple nearest neighbors matching to sophisticated deep learning techniques. A few references are [7, 8, 10, 18]. In Section 3 we have chosen to use two simple but robust techniques in order to estimate $\mu_x$:

(i) Smoothed $k$-nearest neighbors: for a given $k \in \mathbb{N}$, $k \le m$, let $I_k(x) \subseteq \{1, \dots, m\}$ denote the $k$ indices corresponding to the $k$ points in $\{x^s_1, \dots, x^s_m\}$ which are closest to $x$ with respect to some distance measure. Then $\mu_x$ is estimated by

$$\hat{\mu}_x = \frac{1}{k} \sum_{i \in I_k(x)} N(y^s_i, \sigma),$$

where $N(\xi, \sigma)$ denotes the Gaussian distribution with mean $\xi$ and standard deviation $\sigma \ge 0$ (using the convention $N(\xi, 0) = \delta_\xi$).

Smoothed random forest classiﬁcation: suppose that one is interested in the CDF

at certain points

< α

< ··· < α

and consider the random variable

C ∈

{

,... , k}

deﬁned by

j=1

{Y >α

}

. From

,... , y

one obtains realizations

,... , c

under

and, next, random forest classiﬁcation (as described in

[2]) can be used to obtain estimates of the functions

(x) = Q(C = j |X = x), j = 0,1,...,k −1.

Given these estimates, say

,... ,

k−1

, the CDF of µ

is estimated by

((−∞,α

]) =

j=1

j−1

(x)Φ



−α



, i = 1,...,k,

where

is the CDF of a standard Gaussian distribution (using the convention

Φ( · /0) = 1

[0,∞)
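As a rough illustration of the formula in (ii), the following sketch assembles the (possibly smoothed) CDF values on the grid from given class-probability estimates. The function names and arguments are hypothetical and not taken from the paper; the probabilities `p_hat` are assumed to come from some classifier (for example, the `predict_proba` output of scikit-learn's `RandomForestClassifier`), and only the Python standard library is used.

```python
import math


def std_normal_cdf(z):
    """CDF of the standard Gaussian distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_cdf_from_classes(p_hat, alphas, sigma=0.0):
    """Estimate mu_x((-inf, alpha_i]) for i = 1, ..., k.

    p_hat  -- estimates of Q(C = j | X = x) for j = 0, ..., k - 1
    alphas -- increasing grid alpha_1 < ... < alpha_k
    sigma  -- bandwidth; sigma = 0 uses the convention Phi(./0) = 1_[0, inf)
    """
    if len(p_hat) != len(alphas):
        raise ValueError("p_hat and alphas must have the same length k")
    cdf = []
    for a_i in alphas:
        total = 0.0
        for p, a_j in zip(p_hat, alphas):  # p plays the role of p_{j-1}
            if sigma > 0:
                total += p * std_normal_cdf((a_i - a_j) / sigma)
            elif a_i >= a_j:
                total += p
        cdf.append(total)
    return cdf
```

With `sigma=0` the result is simply the running sum of `p_hat`, that is, the estimated probability mass of the class $C = j - 1$ is placed at the grid point $\alpha_j$.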

Both techniques are easily implemented in Python using modules from the scikit-learn library (see [11]). The distance measure $d$, referred to in (i), would usually be of the form

$$d(x, y) = \sqrt{(x - y)^\top M (x - y)}, \quad x, y \in \mathbb{R}^d,$$

for some positive definite $d \times d$ matrix $M$. If $M$ is the identity matrix, $d$ is the Euclidean distance, and if $M$ is the inverse sample covariance matrix of the covariates, $d$ is the Mahalanobis distance. Note that, since the $k$-nearest neighbors ($k$NN) approach suffers from the curse of dimensionality, one would either require that $X$ is low-dimensional, reduce the dimension by applying dimensionality reduction techniques or use another balancing function than the identity function (that is, finding an alternative function $b$ satisfying (2.4)).
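To make technique (i) and the averaging in (2.1) concrete, here is a minimal standard-library sketch. It is not the scikit-learn implementation referred to above: the function names are hypothetical, the neighbor search is brute force, and `M` is assumed to be a positive definite matrix given as a list of rows (with `None` meaning the identity).

```python
import math


def quad_distance(x, y, M=None):
    """sqrt((x - y)^T M (x - y)); M = None means the identity (Euclidean)."""
    diff = [a - b for a, b in zip(x, y)]
    if M is None:
        return math.sqrt(sum(v * v for v in diff))
    q = sum(diff[i] * M[i][j] * diff[j]
            for i in range(len(diff)) for j in range(len(diff)))
    return math.sqrt(q)


def knn_indices(x, xs_sim, k, M=None):
    """Indices of the k simulated covariates closest to x."""
    order = sorted(range(len(xs_sim)),
                   key=lambda i: quad_distance(x, xs_sim[i], M))
    return order[:k]


def knn_cdf(t, x, xs_sim, ys_sim, k, sigma=0.0, M=None):
    """Smoothed kNN estimate of mu_x((-inf, t])."""
    total = 0.0
    for i in knn_indices(x, xs_sim, k, M):
        if sigma > 0:
            z = (t - ys_sim[i]) / sigma
            total += 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        else:
            # sigma = 0: each neighbor contributes a Dirac mass at its load
            total += 1.0 if ys_sim[i] <= t else 0.0
    return total / k


def surrogate_cdf(t, xs_obs, xs_sim, ys_sim, k, sigma=0.0, M=None):
    """Average the conditional estimates over the observed covariates, as in (2.1)."""
    return sum(knn_cdf(t, x, xs_sim, ys_sim, k, sigma, M)
               for x in xs_obs) / len(xs_obs)
```

Passing the inverse sample covariance matrix of the covariates as `M` would give the Mahalanobis variant; for large samples a tree-based or library neighbor search would be preferable to the brute-force sort used here.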

2.2 Validation of the simulation environment

The validation of the simulation environment concerns how to evaluate whether or not Condition 2.1 is satisfied. Part (i) of the condition boils down to checking whether it is plausible that a realization $x$ of $X$ under the physical measure $P$ could also happen under the artificial measure $Q$ or, by negation, whether $x$ is an outlier relative to the simulations of $X$. Outlier detection methods have received a lot of attention over decades and, according to Hodge and Austin [6], they generally fall into one of three classes: unsupervised clustering (pinpoints most remote points to be considered as potential outliers), supervised classification (based on both normal and abnormal training data, an observation is classified either as an outlier or not) and semi-supervised detection (based on normal training data, a boundary defining the set of normal observations is formed). We will be using a $k$NN outlier detection method, which belongs to the first class, and which bases the conclusion of whether $x$ is an outlier or not on the average distance from $x$ to its $k$ nearest neighbors. The motivation for applying this method is two-fold: (i) an extensive empirical study [3] of the unsupervised outlier detection methods concluded that the $k$NN method, despite its simplicity, is a robust method that remains the state of the art when compared across various datasets, and (ii) given that we already compute the distances to the $k$ nearest neighbors to estimate $\mu_x$, the additional computational burden induced by using the $k$NN outlier detection method is minimal. For more on outlier detection methods, see [1, 3, 6, 19] and references therein.

Following the setup of Section 2.1, let $x^s_1, \dots, x^s_m$ be realizations of $X$ under $Q$ and denote by $I_k(x)$ the set of indices corresponding to the $k$ realizations closest to $x$ with respect to some metric $d$ (such as the Euclidean or Mahalanobis distance). Then, for observations $x_1, \dots, x_n$ under $P$, the algorithm goes as follows:

(1) For $i = 1, \dots, n$ compute the average distance from $x_i$ to its $k$ nearest neighbors,

$$\bar{d}_i = \frac{1}{k} \sum_{j \in I_k(x_i)} d(x_i, x^s_j).$$

(2) Obtain a sorted list $\bar{d}_{(1)} \le \cdots \le \bar{d}_{(n)}$ of $\bar{d}_1, \dots, \bar{d}_n$ and detect, e.g., by visual inspection, a point $j$ at which the structure of the function $i \mapsto \bar{d}_{(i)}$ changes significantly.

(3) Regard any $x_i$ with $\bar{d}_i \ge \bar{d}_{(j)}$ as an outlier.
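The three steps above can be sketched as follows. This is a minimal illustration with hypothetical function names, using the Euclidean distance and a brute-force neighbor search; the change point `j` of step (2) is taken as an input, since the text leaves its selection to visual inspection.

```python
import math


def avg_knn_distance(x, xs_sim, k):
    """Average Euclidean distance from x to its k nearest simulated points."""
    dists = sorted(math.dist(x, s) for s in xs_sim)
    return sum(dists[:k]) / k


def flag_outliers(xs_obs, xs_sim, k, j):
    """Steps (1)-(3): flag x_i whose average distance is >= the j-th smallest.

    j is the 1-based change point picked in step (2), e.g. by visual inspection.
    Returns the list of outlier flags and the threshold that was used.
    """
    d_bar = [avg_knn_distance(x, xs_sim, k) for x in xs_obs]  # step (1)
    threshold = sorted(d_bar)[j - 1]                          # step (2)
    return [d >= threshold for d in d_bar], threshold         # step (3)
```

For example, `flag_outliers(xs_obs, xs_sim, k=10, j=len(xs_obs))` would flag only the observation(s) with the largest average distance.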

Part (ii) of Condition 2.1 can usually not be checked, since we do not have any realizations of $Y$ under $P$; this is similar to the issue of verifying the MAR assumption in an incomplete data problem. Of course, if such realizations are available we can estimate the conditional distribution of $Y$ given $X$ under both $P$ and $Q$ and compare the results.

3 Application to extreme event estimation for wind turbines

In this section we will consider the possibility of estimating the distribution of the 10-minute maximum down-wind bending moment (load) on the tower top, middle and base on an operating wind turbine from its 10-minute operational statistics. The data consists of 19976 10-minute statistics from the turbine under normal operation over a period from February to September 2017. Since this particular turbine is part of a measurement campaign, load measurements are available, and these will be used to assess the performance of the surrogate model (see Figure 1 for the histogram and CDF of measured loads).

To complement the measurements, a simulation tool is used to obtain 50606 simulations of both the operational statistics and the corresponding tower loads. We choose to use the following eight operational statistics as covariates:

• Electrical power (maximum and standard deviation)

• Generator speed (maximum)

• Tower top down-wind acceleration (standard deviation)

• Blade flap bending moment (maximum, standard deviation and mean)

• Blade pitch angle (minimum)


Figure 1: Measured load distributions. Left and right plots correspond to histograms and CDFs, respectively, based on 19976 observations of the tower top (first row), middle (second row) and base (third row) down-wind bending moments.

The selection of covariates is based on a physical interpretation of the problem and on leaving out covariates which from a visual inspection (that is, plots of the two-dimensional coordinate projections) seem to violate the support assumption imposed in Condition 2.1(i). The loads and each of the covariates are standardized by subtracting the sample mean and dividing by the sample standard deviation (both of these statistics are computed from the simulated values). In the setup of Section 2, this means that we have realizations of $X \in \mathbb{R}^8$ and $Y \in \mathbb{R}^3$ under both $P$ and $Q$ (although the typical case would be that $Y$ is not realized under $P$). This gives us the opportunity to compare the results of our surrogate model with the, otherwise unobtainable, estimate (1.1) of $P(Y \in \cdot\,)$.

In order to sharpen the estimate of $\mu_x$ for covariates $x$ close to the measured ones, we discard simulations which are far from the domain of the measured covariates. Effectively, this is done by reversing the $k$NN approach explained in Section 2.2 as we compute average distances from simulated covariates to the $k$ nearest measured covariates, sort them and, eventually, choose a threshold that defines the relevant simulations. We will use $k = 1$ and compute the sorted average distances in terms of the Mahalanobis distance. The selection of threshold is not a trivial task and, as suggested in Section 2.2, the best strategy may be to inspect visually if there is a certain point at which the structure of the sorted average distances changes significantly. To obtain a slightly less subjective selection rule, we use the following ad hoc rule: the threshold is defined to be $\bar{d}_{(\tau)}$, the $\tau$th smallest average distance, where $\tau$ is the point that minimizes the $L^1$ distance

$$\operatorname{dist}_{L^1}(f, f_\tau) := \int_1^m |f(x) - f_\tau(x)| \, dx \tag{3.1}$$

between the function $f$ that linearly interpolates $(1, \bar{d}_{(1)}), \dots, (m, \bar{d}_{(m)})$ and the function $f_\tau$ that linearly interpolates $(1, \bar{d}_{(1)})$, $(\tau, \bar{d}_{(\tau)})$, $(m, \bar{d}_{(m)})$ over the interval $[1, m]$ (see the left plot of Figure 2). This selection rule implies a threshold of 6.62 with $\tau = 46100$, which in turn implies that 4506 (8.90 %) of the simulations are discarded before estimating the conditional load distributions. See the right plot of Figure 2 for a visual illustration of the threshold selection. Of course, a more (or less) conservative selection rule can be obtained by using another distance measure than (3.1).
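The ad hoc rule based on (3.1) can be sketched as below. The function name `l1_knee` is hypothetical, and the brute-force search over all interior change points costs $O(m^2)$ operations, so for samples of the size used here one would probably restrict the search to a coarser grid of candidates. Since both interpolants are linear between consecutive integers, the integral is computed exactly segment by segment.

```python
def l1_knee(sorted_d):
    """Return (tau, error): the 1-based index tau minimising the L1 distance
    between the interpolant of all sorted distances and the two-segment
    interpolant through positions 1, tau and m."""
    m = len(sorted_d)
    if m < 3:
        raise ValueError("need at least three sorted distances")

    def chord(i, a, b):
        # Value at position i of the line through (a, d_(a)) and (b, d_(b)).
        da, db = sorted_d[a - 1], sorted_d[b - 1]
        return da + (db - da) * (i - a) / (b - a)

    def abs_area(g0, g1):
        # Integral of |g| over a unit interval, g linear with end values g0, g1.
        if g0 * g1 >= 0:
            return 0.5 * (abs(g0) + abs(g1))
        # Sign change: two triangles meeting at the zero crossing.
        return 0.5 * (g0 * g0 + g1 * g1) / (abs(g0) + abs(g1))

    best_tau, best_err = None, float("inf")
    for tau in range(2, m):
        diffs = []
        for i in range(1, m + 1):
            f_tau = chord(i, 1, tau) if i <= tau else chord(i, tau, m)
            diffs.append(sorted_d[i - 1] - f_tau)
        err = sum(abs_area(diffs[i], diffs[i + 1]) for i in range(m - 1))
        if err < best_err:
            best_tau, best_err = tau, err
    return best_tau, best_err
```

The threshold is then `sorted_d[tau - 1]`; ties are resolved in favour of the smallest minimising index, which is a choice made for this sketch only.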

Figure 2: Blue curve: sorted distance from simulated covariates to nearest measured covariates. Left: linear interpolation of $(1, \bar{d}_{(1)})$, $(\tau, \bar{d}_{(\tau)})$, $(m, \bar{d}_{(m)})$ with shaded region representing the corresponding $L^1$ error for $\tau = 48500$. Right: the orange curve is the normalised $L^1$ error as a function of $\tau$ and the dashed black lines indicate the corresponding minimum and selected threshold.

The same procedure is repeated, now precisely as described in Section 2.2, to detect potential outliers in the measurements. In this case, $k = 10$ is used since this will be the same number of neighbors used to estimate $\mu_x$. The threshold is 2.45 with $\tau = 18400$, and hence 1577 (8.57 %) of the measurements are found to be potential outliers (see also Figure 3).

To assess which points have been labeled as potential outliers, two-dimensional projections of the outliers, inliers and simulations are plotted in Figure 4 (if a point seems to be an outlier in the projection plot the original eight-dimensional vector should also be labeled an outlier). To restrict the number of plots we only provide 18 (out of 28) of the projection plots corresponding to plotting electrical power (maximum), blade flap bending moment (maximum) and generator speed (maximum) against each other and all the remaining five covariates. The overall picture of Figure 4 is that a significant part of the observations that seem to be outliers is indeed labeled as such. Moreover, some of the labeled outliers seem to form a horizontal or vertical line, which could indicate a period of time where one of the inputs was measured to be constant. Since this is probably caused by a logging error, such measurements should indeed be declared invalid (outliers).

Figure 3: The blue curve is the sorted distance from measured covariates to the 10 nearest simulated covariates, the orange curve is the $L^1$ error as a function of $\tau$, and the dashed black lines indicate the corresponding minimum and selected threshold. All points with average distance larger than the threshold are labeled possible outliers.

Figure 4: Some of the two-dimensional projections of the covariates. Blue dots are simulations, orange dots are inliers and green dots are potential outliers.

Next, we would need to check if the distributional properties of the load can be expected to change by removing outliers. In an incomplete data setup, the outliers may be treated as the missing observations, and hence we want to assess whether the Missing (Completely) at Random mechanism is in force (recall the discussion in Remark 2.2). If the operation of removing outliers causes a significant change in the load distribution, then the outliers cannot be ignored and would need to be handled separately. In Figure 5 the histograms of tower top, middle and base load obtained from all measurements (the same as those in the three rows of Figure 1) are compared to those where the outliers have been removed. It becomes immediately clear that the distributions are not unchanged, since most of the outliers correspond to the smallest loads of all measurements. However, it seems reasonable to believe that the conditional distribution of the load given that it exceeds a certain (even fairly small) threshold is not seriously affected by the exclusion of outliers. Since the interest is in the estimation of extreme events, that is, one often focuses only on large loads, it may be sufficient to match these conditional excess distributions. Hence, we choose to exclude the outliers without paying further attention to them. It should be noted that, since the outlier detection method only focuses on covariates, it does not take into account their explanatory power on the loads. For instance, it might be that a declared outlier only differs from the simulations with respect to covariates that do not significantly help explain the load level. While this could suggest using other distance measures, this is not a direction that we will pursue here.

Figure 5: Histograms of measurements on tower top (left), middle (mid) and base (right) down-wind bending moments. Measurements including and excluding outliers are represented in blue and orange, respectively.

We will rely on (2.1) together with the two methods presented in Section 2.1 to estimate the load distributions. The unsmoothed version of both methods (that is, $\sigma = 0$) will be used, and for the $k$NN method we will choose $k = 10$. There are at least two reasons for initially choosing the bandwidth $\sigma$ to be zero: (i) it can be a subtle task to select the optimal bandwidth as there is no universally accepted approach, and (ii) given that we have a fairly large dataset, most of the estimated values of the CDFs should be fairly insensitive to the choice of bandwidth. In Figure 6 we have plotted the empirical CDF of the loads (that is, the CDF of (1.1) based on measured loads) together with the estimates provided by the $k$NN and random forest approach. Since the loads are 10-minute maxima, it is natural to compare the CDFs to those of GEV type (cf. the Fisher–Tippett–Gnedenko theorem). For this reason, and in order to put attention on the estimation of the tail, we have also plotted the $-\log(-\log(\,\cdot\,))$ transform of the CDFs. Recall that, when applying such a transformation to the CDF, the Gumbel, Weibull and Fréchet distributions would produce straight lines, convex curves and concave curves, respectively. From the plots it follows that, generally, the estimated CDFs are closest to the empirical CDF for small and large quantiles. Estimated $q$-quantiles tend to be smaller than the true ones for moderate values of $q$. One would expect that, given only the eight covariates as considered here, a significant part of the errors would be due to differences between the simulation environment and the real-world environment. From an extreme event estimation perspective, the most important part of the curve would be the last 10 %–20 %, corresponding to quantiles above 0.8 or 0.9. On this matter, the $-\log(-\log(\,\cdot\,))$ transform of the CDFs reveals that the estimated CDFs have some difficulties in replicating the tail of the distribution for middle and base load. However, since there are few extreme observations, this is also the part where a potential smoothing (positive bandwidth) would have an effect.
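The shape statement for the $-\log(-\log(\,\cdot\,))$ transform can be checked directly from the GEV distribution function. The following is a short supplementary derivation; the parametrization by a shape parameter $\xi$ (with $\xi = 0$ Gumbel, $\xi > 0$ Fréchet and $\xi < 0$ Weibull) is the usual one and is an assumption about the intended convention.

```latex
G_\xi(z) = \exp\!\bigl(-(1 + \xi z)^{-1/\xi}\bigr), \quad 1 + \xi z > 0,\ \xi \neq 0,
\qquad
G_0(z) = \exp\!\bigl(-e^{-z}\bigr),
\\[4pt]
T_\xi(z) := -\log\bigl(-\log G_\xi(z)\bigr) = \frac{1}{\xi}\log(1 + \xi z),
\qquad
T_0(z) = z,
\qquad
T_\xi''(z) = -\frac{\xi}{(1 + \xi z)^2}.
```

Hence the transform is linear for $\xi = 0$, concave for $\xi > 0$ and convex for $\xi < 0$; an affine change of variable $z = (y - \mu)/s$ with $s > 0$ does not alter these properties.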

To test the smoothing effect, we choose $\sigma$ according to Silverman’s rule of thumb, that is, $\sigma = 1.06\,\hat{\sigma}\,(kn)^{-1/5}$, where $n = 18399$ is the number of measurements (without outliers) and $\hat{\sigma}$ is the sample standard deviation of the $kn$ load simulations (top, middle or base) used for obtaining the $k$NN estimate of the given load distribution. For details about this choice of bandwidth, and bandwidth selection in general, see [16]. In Figure 7 we have compared the $-\log(-\log(\,\cdot\,))$ transforms of the smoothed estimates of the CDFs and the empirical CDF.
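A rule-of-thumb bandwidth of this kind is straightforward to compute. The helper below is a hypothetical sketch using only the standard library; which sample size enters the rate is left to the caller through the optional argument `n`, since this depends on how the simulations entering the estimate are counted.

```python
import statistics


def silverman_bandwidth(values, n=None):
    """Rule-of-thumb bandwidth 1.06 * s * n**(-1/5).

    values -- sample used for the sample standard deviation s
    n      -- sample size entering the rate; defaults to len(values)
    """
    if n is None:
        n = len(values)
    return 1.06 * statistics.stdev(values) * n ** (-1 / 5)
```

The resulting value would be passed as the bandwidth `sigma` of the smoothed estimators from Section 2.1.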

Figure 6: Plots of CDFs (first column) and the corresponding $-\log(-\log(\,\cdot\,))$ transforms (second column) of tower top (first row), middle (second row) and base (third row) down-wind bending moments. The blue curve is the empirical distribution of the measurements, and the orange and green curves are the $k$NN and random forest predictions, respectively.

It seems that the smoothed versions of the estimated curves generally fit the tail better for the tower top and middle loads, but tend to overestimate the larger quantiles for the tower base load. This emphasizes that the smoothing should be used with caution; when smoothing the curve, one would need to decide from which point the estimate of the CDF is not reliable (as the Gaussian smoothing always will dominate the picture sufficiently far out in the tail). When no smoothing was used, the uncertainty of the estimates was somewhat reflected in the roughness of the curves.


Figure 7: Plots of $-\log(-\log(\,\cdot\,))$ transforms of CDFs of tower top (left), middle (center) and base (right) down-wind bending moments. The blue curve is the empirical distribution of the measurements, and the orange and green curves are the smoothed $k$NN and random forest predictions, respectively, using Silverman’s rule of thumb.

We end this study with Table 1, which compares some of the estimated quantiles with the true (empirical) ones. From this table we see that the errors tend to be largest for the 25 %, 50 % and 75 % quantiles and fairly small for the 95 %, 99 % and 99.5 % quantiles, which is in line with the conclusion based on Figure 6. Moreover, it also appears that no consistent improvements of the tail estimates are obtained by using the smoothed CDF estimates.

Table 1: Some quantiles of the empirical load distributions and of the corresponding $k$NN and random forest estimates.

Quantile (%)  Load   kNN       kNN          Random    Random forest  Empirical
                               (smoothed)   forest    (smoothed)
25            Top    −1.7349   −1.7344      −1.7941   −1.7315        −1.5528
              Mid    −1.4252   −1.4434      −1.3607   −1.2773        −1.1427
              Base   −1.4689   −1.4794      −1.4474   −1.3653        −1.3576
50            Top    −0.7111   −0.7106      −0.9544   −0.8928        −0.3204
              Mid     0.2181    0.2114       0.1587    0.2147         0.5002
              Base    0.1018    0.1152       0.0047    0.0547         0.2087
75            Top     0.1643    0.1626      −0.0501   −0.0055         0.1991
              Mid     1.1114    1.1076       1.1819    1.2192         1.5460
              Base    0.9407    0.9366       0.9978    1.0247         1.2192
95            Top     0.6936    0.7122       0.6951    0.7414         0.7161
              Mid     1.6855    1.7090       1.7283    1.7913         1.8670
              Base    1.6782    1.4653       1.4651    1.5184         1.4957
99            Top     0.9611    0.9815       1.0068    1.0631         1.0271
              Mid     1.8583    1.9383       1.9386    2.0385         1.9917
              Base    1.5877    1.6676       1.6245    1.7240         1.6179
99.5          Top     1.0313    1.0687       1.0944    1.1522         1.1155
              Mid     1.9180    2.0113       2.0195    2.1213         2.0418
              Base    1.6341    1.7337       1.6716    1.7910         1.6594

4 Conclusion

In this paper we presented a surrogate model for the purpose of estimating extreme events. The key assumption was the existence of a simulation environment which produces realizations of the vector (X, Y) in such a way that the conditional distribution of the variable of interest Y equals the true one given a suitable set of observable covariates X. It was noted that this corresponds to the Missing at Random assumption in an incomplete data problem. Next, we briefly reviewed the literature on conditional probability estimation, as this is the critical step in translating valid simulations into an estimate of the true unconditional distribution of Y. Finally, we checked the performance of the surrogate model on real data by using an appropriate simulation environment to estimate the distribution of the tower top, middle and base down-wind loads on an operating wind turbine from its operational statistics. The surrogate model seemed to succeed in estimating the tail of the load distributions, but it tended to underestimate loads of normal size.
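The step from simulated pairs of covariates and responses to an unconditional distribution estimate can be sketched as follows. This is one possible conditional probability estimator, a plain k-nearest-neighbour average, written under stated assumptions: Euclidean distance, equal weights on the k neighbours, and the hypothetical function names `knn_surrogate_weights` and `surrogate_cdf`. It is not presented as the implementation used in the study.

```python
import numpy as np

def knn_surrogate_weights(x_sim, x_obs, k):
    """Weight of each simulated response in the unconditional CDF estimate.

    For every observed covariate vector, the k nearest simulated covariate
    vectors each receive weight 1/k; the result is averaged over observations.
    Both inputs are arrays of shape (n, d).
    """
    x_sim = np.asarray(x_sim, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    weights = np.zeros(x_sim.shape[0])
    for x in x_obs:
        dist = np.linalg.norm(x_sim - x, axis=1)
        nearest = np.argpartition(dist, k - 1)[:k]  # indices of the k smallest distances
        weights[nearest] += 1.0 / k
    return weights / x_obs.shape[0]

def surrogate_cdf(y_sim, weights, grid):
    """Estimated unconditional CDF: total weight of simulated responses <= y."""
    y_sim = np.asarray(y_sim, dtype=float)
    return np.array([weights[y_sim <= y].sum() for y in grid])

# Toy example with one covariate (hypothetical numbers).
x_sim = np.array([[0.0], [1.0], [10.0], [11.0]])
y_sim = np.array([0.0, 0.0, 5.0, 5.0])
x_obs = np.array([[0.5], [0.4], [10.5]])
w = knn_surrogate_weights(x_sim, x_obs, k=2)
cdf_values = surrogate_cdf(y_sim, w, grid=[0.0, 5.0])
```

In the toy example two of the three observations lie near the simulations with response 0, so the estimated CDF places mass 2/3 at 0 and the remaining 1/3 at 5.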

Acknowledgments

We thank James Alexander Nichols from Vestas Wind Systems A/S (Load & Control)

and Jan Pedersen for fruitful discussions. This work was supported by the Danish

Council for Independent Research (grant DFF–4002–00003).

References

[1] Ben-Gal, I. (2005). “Outlier detection”. Data mining and knowledge discovery handbook. Springer, 131–146.

[2] Breiman, L. (2001). Random forests. Machine learning 45(1), 5–32.

[3] Campos, G.O., A. Zimek, J. Sander, R.J.G.B. Campello, B. Micenková, E. Schubert, I. Assent and M.E. Houle (2016). On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Min. Knowl. Discov. 30(4), 891–927. doi: 10.1007/s10618-015-0444-8.

[4] De Haan, L. and A. Ferreira (2007). Extreme value theory: an introduction. Springer Science & Business Media.

[5] Gilli, M. et al. (2006). An application of extreme value theory for measuring financial risk. Comput. Econ. 27(2-3), 207–228.

[6] Hodge, V. and J. Austin (2004). A survey of outlier detection methodologies. Artif. Intell. Rev. 22(2), 85–126.

[7] Husmeier, D. (2012). Neural networks for conditional probability estimation: Forecasting beyond point predictions. Springer Science & Business Media.

[8] Hyndman, R.J., D.M. Bashtannyk and G.K. Grunwald (1996). Estimating and visualizing conditional densities. J. Comput. Graph. Statist. 5(4), 315–336. doi: 10.2307/1390887.

[9] Little, R.J. and D.B. Rubin (2019). Statistical analysis with missing data. Vol. 793. Wiley.


[10] Neuneier, R., F. Hergert, W. Finnoff and D. Ormoneit (1994). “Estimation of conditional densities: A comparison of neural network approaches”. International Conference on Artificial Neural Networks. Springer, 689–692.

[11] Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al. (2011). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12(Oct), 2825–2830.

[12] Ragan, P. and L. Manuel (2008). Statistical extrapolation methods for estimating wind turbine extreme loads. J. Sol. Energy Eng. 130(3), 031011.

[13] Rosenbaum, P.R. and D.B. Rubin (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55. doi: 10.1093/biomet/70.1.41.

[14] Rubin, D.B. (1976). Inference and missing data. Biometrika 63(3). With comments by R. J. A. Little and a reply by the author, 581–592. doi: 10.1093/biomet/63.3.581.

[15] Scheffer, J. (2002). Dealing with missing data. Res. Lett. Inf. Math. Sci. 3, 153–160.

[16] Silverman, B.W. (1986). Density estimation for statistics and data analysis. Monographs on Statistics and Applied Probability. Chapman & Hall, London. doi: 10.1007/978-1-4899-3324-9.

[17] Smith, R.L. (1990). Extreme value theory. Handbook of applicable mathematics 7, 437–471.

[18] Sugiyama, M., I. Takeuchi, T. Suzuki, T. Kanamori, H. Hachiya and D. Okanohara (2010). Least-squares conditional density estimation. IEICE T. Inf. Syst. 93(3), 583–594.

[19] Zimek, A., E. Schubert and H.-P. Kriegel (2012). A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min. 5(5), 363–387. doi: 10.1002/sam.11161.
