[SciPy-Dev] Stochastic calculus package

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Jun 15 14:37:59 EDT 2018


Sounds good,

One thing that immediately jumps into my eyes are the lower case class names.
I think using pep-8 capital/camel case would make it look more modern.
Only old fashioned packages like numpy and scipy still have the legacy
spelling (when developers thought python has to look like matlab :)

Josef



On Fri, Jun 15, 2018 at 1:51 PM, Maurizio Cipollina
<maurizio.cipollina at gmail.com> wrote:
> Hi everybody,
>
> following up on this conversation of some time ago, a pre-release of the
> package, renamed SdePy, is now available on pip
> (https://pypi.org/project/sdepy  ) and as a project on github
> (https://github.com/sdepy/sdepy ), version 1.0.0rc3. A short quickstart
> guide (https://sdepy.readthedocs.io/en/latest/intro.html#id2 ) is the
> easiest path to see what it does and what it does not. Any feedback,
> comments and suggestions are very welcome!
>
> Since last September, the package has evolved quite a bit and had its fair
> share of refactoring, going through several 0.x unpublished versions before
> stabilizing in its current form (marked as beta). Here are a few takeaways I
> got from this thread, which I acted upon:
>
> -  Integrating user defined SDEs turned out to be a main point of interest:
> this ‘odeint’ like functionality has been considerably streamlined and
> simplified - formerly, preset processes were mainly in focus, and getting a
> new process up and running took a deep dive in a not-so-welcoming API, now
> upgraded as well (the former ‘integrator’ has been superseded by a
> ‘paths_generator’ parent class, and cooperating ‘integrator’ and ‘SDE’
> classes).
>
> -  The ‘kfunc’ beast was admittedly a hard sell: all undue dependencies of
> the core modules on its machinery have been eliminated, it plays now the
> role of an optional decorator for the shortcuts of frequently used stuff
> (it’s fully functional though and, in my experience, quite handy).
>
> -  The question “Do you REALLY need a ‘process’ class” clearly stood out in
> your comments: after some experimentation, the main reason for answering in
> the yes turned out, in my view, to be its seamless interoperability with the
> rest of the package, hardly achievable without a dedicated design. Now
> process instances can indeed be fed to integrators, both as time-dependent
> and/or path-dependent SDE parameters, and as stochasticity sources: this
> opened countless hacking opportunities when doing computations, and spoiling
> them seemed a pity.
>
> In case any of you is willing to take a look, feel free to get in touch for
> any problems/bugs/questions you might have
>
> Cheers
> Maurizio
>
> On Mon, 11 Sep 2017 at 19:39, <josef.pktd at gmail.com> wrote:
>>
>>
>>
>> On Mon, Sep 11, 2017 at 12:35 PM, Maurizio Cipollina
>> <maurizio.cipollina at gmail.com> wrote:
>>>
>>> Hi Ralf,
>>>
>>>
>>> thanks for going through this, here are some follow up comments on your
>>> points:
>>>
>>>
>>>
>>> On the two packages you found:
>>>
>>> Sdeint: provides several SDE integration algorithms but no specific
>>> processes; does not cover processes with jumps; offers no explicit
>>> provisions for handling solutions in a number of paths, which is what you
>>> need since the integration output is random variables, not numbers.
>>> SDE-higham is not a package or tool, but rather a collection of recipes
>>> for a few textbook examples (no jumps either), in the form that an
>>> interactive session might take (set globals, do calculations, plot and print
>>> results).
>>>
>>>
>>>
>>> About your questions/concerns:
>>>
>>> ndarray subclassing: the integrator does not depend on the process class,
>>> and process realizations may be easily tweaked to return an ndarray instead
>>> of a process instance; however, a main point of the package is not just to
>>> produce SDE solutions, but to allow the user to handle them (almost)
>>> effortlessly once obtained: so one might possibly hide the process class
>>> inside the testing suite, which depends on it, but hardly get rid of it
>>> altogether. Indeed, I subclassed ndarray as safely as I could figure out,
>>> but I will surely go through the scipy-dev archives and find out what
>>> advised you against doing so.
>>> odeint and scipy.integrate: my view is that odeint churns out numbers
>>> dependant on numbers, while SDE integration churns out random variables
>>> dependant on random variables (the stochasticity sources), so there is a
>>> shift of paradigm and the two schemes do not fit smoothly one within the
>>> other. The step by step integration of SDEs makes sense, and gets useful, if
>>> you can easily control the inputs (eg. correlations among stochasticity
>>> sources fed into different SDEs), generate the output in a decent number of
>>> paths, and easily extract statistics of expressions dependent on it (hence
>>> the flow ‘get a number of paths packaged as a process instance’ -> ‘do numpy
>>> algebra with it’ -> ‘estimate probability distributions etc. using the
>>> process class methods’ -> ‘scale it up with the montecarlo class in case the
>>> needed paths do not fit into memory’).
>>> Montecarlo simulations is a big topic indeed, that goes far beyond this
>>> package; what is called the montecarlo class is in fact a tool to cumulate
>>> statistical information extracted from a number of realizations of a random
>>> variable, and is focused on making easy to interpret the output of SDE
>>> integration.
>>> Black-Scholes assists one more tests in a rather extensive testing suite,
>>> and might be painlessly dropped.
>>>
>>>
>>>
>>> This said, I see your general feeling is that this whole subject is too
>>> specialized to fit into scipy at this stage. I suggested otherwise, in my
>>> proposal, because stochastic calculus is regarded as one of the pillars of
>>> modern probability theory, and covering it into scipy might be relevant to
>>> lots of mathematicians and physicists, both practitioners and students, that
>>> might otherwise hesitate to adopt, and build dependencies upon, a standalone
>>> package. In case you reconsider let me know, I’d be happy to help (now or in
>>> the future…).
>>>
>>>
>>>
>>> If you have further questions, let me know as well.
>>>
>>> Cheers
>>>
>>> Maurizio
>>
>>
>>
>> To partially follow up, similar to Ralph's view.
>>
>> I think it would be good to have this as a separate package, or at least
>> the code on github.
>> With distributions like conda it is now much easier to install additional
>> packages, once they are established enough and are included in distributions
>> or easily pip installable.
>>
>> The main reason that I think scipy is currently not the appropriate
>> package for it is that the design and review would require a lot of work. I
>> guess that there are not many maintainers and reviewers that are familiar
>> with this area. I also know only of applications in finance and it is not
>> clear whether or which parts would be of more general interest.
>>
>> In my opinion, some core tools for SDE and jump diffusions would fit in
>> scipy. But the application support will not or might not fit well in a
>> general purpose scientific numerical library. This would be similar to other
>> areas where scipy provides the core computational tools, but applications
>> are supported by other packages.
>>
>> For example, the process class might be similar in magnitude to scipy
>> stats distributions or signal statespace classes, but I have no idea what
>> design for it would fit in scipy.
>> Some of the time aggregation properties sound similar to the corresponding
>> pandas functions, and it is not clear whether users would want to use a
>> separate class or just use pandas instead.
>>
>> Similarly, the kfunc class sounds like something that doesn't have a
>> similar pattern in scipy.
>>
>> Design changes can be made more easily in a standalone package, while all
>> refactorings of classes in scipy are difficult issues because of the
>> backwards compatibility policy which requires that the design should be
>> largely settled before a merge.
>>
>> Josef
>>
>> https://groups.google.com/d/msg/pystatsmodels/xsttiEiyqAg/maR6n4EeAgAJ
>> (stackoverflow question has been deleted "Is there a library out there to
>> calibrate an Ornstein-Uhlenbeck model?")
>>
>>
>> https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/tsa/diffusion.py
>> and diffusion2.py
>>
>>
>>
>>>
>>>
>>>
>>>
>>> On 9 September 2017 at 03:31, Ralf Gommers <ralf.gommers at gmail.com>
>>> wrote:
>>>>
>>>> Hi Maurizio,
>>>>
>>>>
>>>> On Fri, Sep 8, 2017 at 7:55 PM, Maurizio Cipollina
>>>> <maurizio.cipollina at gmail.com> wrote:
>>>>>
>>>>> Hi everybody, and first of all thanks for the great job you have been
>>>>> doing, on which I relied *A LOT* over time.
>>>>>
>>>>> Stochastic calculus and Stochastic Differential Equations are a
>>>>> significant branch of mathematics that to my understanding is
>>>>> underrepresented in scipy, as well as in the open source python stack at
>>>>> large. I may well be missing something, but as I can see one can find out
>>>>> there many recipies or collections of recipies, and a number of
>>>>> packages/projects skewed towards finance, but no structured framework for
>>>>> generating and handling stochastic processes and numerically solving SDEs.
>>>>
>>>>
>>>> I've never used these, but a quick search turns up two packages:
>>>>   https://pypi.python.org/pypi/sdeint
>>>>   https://github.com/alu042/SDE-higham
>>>> Not very mature probably, but would be good to compare your code with
>>>> those.
>>>>
>>>>>
>>>>> This is a pity, since scipy/numpy provide efficen implementations of
>>>>> all the basic ingredients needed to do the trick. Out of professional need,
>>>>> and prompted by this gap, I have developed a general purpose package of
>>>>> which I own the copyright, and which I would be happy to release under the
>>>>> BSD license as part of scipy, in case there is a consensus among the scipy
>>>>> community that it makes sense to do so (the package has no dependencies
>>>>> beyond standard library, numpy and scipy, and might probably fit best as a
>>>>> subpackage of scipy.stats).
>>>>>
>>>>> Before setting up a PR, I would be happy to hear your thoughts: I have
>>>>> pasted below the package docstring, that should give an overview of its
>>>>> scope and limitations. Please note that some of the realized processes stray
>>>>> into mathematical finance territory, somehow inevitably since finance is one
>>>>> major field of application of stochastic calculus, but the focus is the
>>>>> latter, not the former, at least in my intentions.
>>>>
>>>>
>>>> First thought: this looks like a useful addition to the scientific
>>>> Python ecosystem, but is probably too specialized for SciPy (at least at
>>>> this stage).
>>>>
>>>> Based on your docstring below, here are some questions/concerns:
>>>> 1. We definitely don't want a new ndarray subclass (there's a pretty
>>>> strong consensus at this point that subclassing ndarray is usually not a
>>>> good solution), so your `process` doesn't sound attractive.
>>>> 2. The `integrator` class sounds a lot like odeint, so maybe this would
>>>> fit better with scipy.integrate?
>>>> 3. Monte-Carlo simulation is a big and fairly technical topic. There are
>>>> whole packages (e.g. PyMC3, emcee) dedicated to this. It may not fit well
>>>> with SciPy.
>>>> 4. Things specific to finance like Black-Scholes and put/call options
>>>> are not a good fit.
>>>> 5. Maintainers for this added code. The ODE integrators in
>>>> scipy.integrate already suffer from lack of a dedicated maintainer, SDEs are
>>>> more specialized so are likely to not be very well-maintained by other devs
>>>> than you.
>>>>
>>>> Without seeing your code it's hard to say for sure, but it looks to me
>>>> like it would be better to release your code as a standalone package.
>>>>
>>>> Cheers,
>>>> Ralf
>>>>
>>>>
>>>>>
>>>>> Thanks in advance for taking your time to go through this, and for your
>>>>> comments and suggestions.
>>>>> Maurizio
>>>>>
>>>>> “””
>>>>> ===========================================================
>>>>> Stochastic - Monte Carlo simulation of stochastic processes
>>>>> ===========================================================
>>>>>
>>>>> This package provides:
>>>>>
>>>>> 1.  A `process` class, a subclass of `numpy.ndarray` representing
>>>>>     a sequence of values in time, realized in one or several paths.
>>>>>     Algebraic manipulations and ufunc computations are supported for
>>>>>     instances that share the same timeline and comply with numpy
>>>>>     broadcasting rules. Interpolation along the timeline is supported
>>>>>     via callability of `process` instances. Process-specific
>>>>> functionalities,
>>>>>     such as averaging and indexing along time or across paths, are
>>>>> delegated
>>>>>     to process-specific methods, attributes and properties (no
>>>>> overriding
>>>>>     of `numpy.ndarray` operations).
>>>>>
>>>>> 2.  The `source` class and its subclasses, stochasticity sources
>>>>> providing
>>>>>     numerical realizations of the differentials commonly found
>>>>>     in stochastic differential equations (SDEs), with or without
>>>>>     memory of formerly invoked realizations.
>>>>>
>>>>> 3.  The `integrator` class, a general framework for numerical SDE
>>>>>     integration, intended for subclassing, to assist the definition of
>>>>> step by
>>>>>     step integration algorithms and the computation of numerical
>>>>> realizations of
>>>>>     stochastic processes.
>>>>>
>>>>> 4.  Several objects providing realizations of specific stochastic
>>>>> processes
>>>>>     along a given timeline and across a requested number of paths.
>>>>>     Realizations are obtained via explicit formulae, in case the
>>>>> relevant SDE
>>>>>     has a closed form solution, or otherwise as `integrator` subclasses
>>>>> performing
>>>>>     Euler-Maruyama numerical integration.
>>>>>
>>>>> 5.  A `montecarlo` class, as an aid to cumulate the results of several
>>>>>     Monte Carlo simulations of a given stochastic variable, and to
>>>>> extract
>>>>>     summary estimates for its probability distribution function and
>>>>>     statistics, thus supporting simulations across a number of paths
>>>>> that exceeds
>>>>>     the maximum allowed by available memory.
>>>>>
>>>>> 6.  Several analytical results relating to the supported stochastic
>>>>>     processes, as a general reference and for testing purpouses.
>>>>>
>>>>> For all sources and realizations, process values can take any shape,
>>>>> scalar or multidimensional. Correlated multivariate stochasticity
>>>>> sources are
>>>>> supported. Time-varying process parameters (correlations, intensity of
>>>>> Poisson
>>>>> processes, volatilities etc.) are allowed whenever applicable.
>>>>> `process` instances act as valid stochasticity source realizations (as
>>>>> does
>>>>> any callable object complying with a `source` protocol), and may be
>>>>> passed as a source specification when computing the realization of a
>>>>> given
>>>>> process.
>>>>>
>>>>> Computations are fully vectorized across paths, providing an efficient
>>>>> infrastructure for simulating a large number of process realizations.
>>>>> Less so, for large number of time steps: integrating 1000 time steps
>>>>> across 100000 paths takes seconds, one million time steps across
>>>>> 100 paths takes minutes.
>>>>>
>>>>> General infrastructure
>>>>> ======================
>>>>> .. autosummary::
>>>>>    :toctree: generated/
>>>>>
>>>>>    process     -- Array representation of a process (a subclass of
>>>>>                   -- numpy.ndarray).
>>>>>    kfunc       -- Base class for functions with managed keyword
>>>>> arguments.
>>>>>    source      -- Base class for stochasticity sources.
>>>>>    true_source -- Base class for stochasticity sources with memory.
>>>>>    integrator  -- General framework for numerical SDE integration.
>>>>>    montecarlo  -- Summmary statistics of Monte Carlo simulations.
>>>>>
>>>>>
>>>>> Stochasticity sources
>>>>> =====================
>>>>> .. autosummary::
>>>>>    :toctree: generated/
>>>>>
>>>>>    wiener              -- dW, a source of standard Wiener process
>>>>> (Brownian
>>>>>                           -- motion) increments.
>>>>>    symmetric_wiener    -- as above, with symmetric paths (averages
>>>>> exactly 0).
>>>>>    true_wiener         -- dW, a source of standard Wiener process
>>>>> (Brownian
>>>>>                           -- motion) increments, with memory.
>>>>>    poisson             -- dN, a source of Poisson process increments.
>>>>>    compound_poisson    -- dJ, a source of compound Poisson process
>>>>> increments.
>>>>>                           -- (jumps distributed in time as a Poisson
>>>>> process,
>>>>>                           -- and in size as a given `scipy.stats`
>>>>> distribution).
>>>>>    compound_poisson_rv -- Preset jump size distributions for compound
>>>>> Poisson
>>>>>                           -- process increments.
>>>>>
>>>>>
>>>>> Stochastic process realizations
>>>>> ===============================
>>>>> .. autosummary::
>>>>>    :toctree: generated/
>>>>>
>>>>>    const_wiener_process          -- Wiener process (Brownian motion),
>>>>> with
>>>>>                                     -- time-independent parameters.
>>>>>    const_lognorm_process         -- Lognormal process, with
>>>>> time-independent
>>>>>                                     -- parameters.
>>>>>    wiener_process                -- Wiener process (Brownian motion).
>>>>>    lognorm_process               -- Lognormal process.
>>>>>    ornstein_uhlenbeck_process    -- Ornstein-Uhlenbeck process
>>>>> (mean-reverting
>>>>>                                     -- Brownian motion).
>>>>>    hull_white_process            -- F-factors Hull-White process (sum
>>>>> of F
>>>>>                                     -- correlated mean-reverting
>>>>> Brownian
>>>>>                                     -- motions).
>>>>>    hull_white_1factor_process    -- 1-factor Hull-White process (F=1
>>>>> Hull-White
>>>>>                                     -- process with F-index collapsed
>>>>> to a
>>>>>                                     -- scalar).
>>>>>    cox_ingersoll_ross_process    -- Cox-Ingersoll-Ross mean-reverting
>>>>> process.
>>>>>    full_heston_process           -- Heston stochastic volatility
>>>>> process
>>>>>                                     -- (returns both process and
>>>>> volatility).
>>>>>    heston_process                -- Heston stochastic volatility
>>>>> process
>>>>>                                     -- (stores and returns process
>>>>> only).
>>>>>    jump_diffusion_process        -- Jump-diffusion process (lognormal
>>>>> process
>>>>>                                     -- with compound Poisson jumps).
>>>>>    merton_jump_diffusion_process -- Merton jump-diffusion process
>>>>>                                     -- (jump-diffusion process with
>>>>> normal
>>>>>                                     -- jump size distribution).
>>>>>    kou_jump_diffusion_process    -- Kou jump-diffusion process
>>>>> (jump-diffusion
>>>>>                                     -- process with double exponential
>>>>>                                     -- jump size distribution).
>>>>>
>>>>> Shortcuts::
>>>>>
>>>>>    lognorm -- lognorm_process
>>>>>    oruh    -- ornstein_uhlenbeck_process
>>>>>    hwp     -- hull_white_process
>>>>>    hw1f    -- hull_white_1factor_process
>>>>>    cir     -- cox_ingersoll_ross_process
>>>>>    heston  -- heston_process
>>>>>    mjd     -- merton_jump_diffusion_process
>>>>>    kou     -- kou_jump_diffusion_process
>>>>>
>>>>>
>>>>> Analytical results
>>>>> ==================
>>>>> Exact statistics for the realized stochastic processes
>>>>> are listed below, limited to the case of constant process parameters
>>>>> and with
>>>>> some further limitations to the parameters' domains.
>>>>> Function arguments are consistent with those of the corresponding
>>>>> processes.
>>>>> Suffixes `pdf`, `cdf` and `chf` stand respectively for probability
>>>>> distribution
>>>>> function, cumulative probability distribution function, and
>>>>> characteristic
>>>>> function. Black-Scholes formulae for the valuation of call and put
>>>>> options have been
>>>>> included (prefixed with `bs` below).
>>>>>
>>>>> .. autosummary::
>>>>>    :toctree: generated/
>>>>>
>>>>>    wiener_mean
>>>>>    wiener_var
>>>>>    wiener_std
>>>>>    wiener_pdf
>>>>>    wiener_cdf
>>>>>    wiener_chf
>>>>>
>>>>>    lognorm_mean
>>>>>    lognorm_var
>>>>>    lognorm_std
>>>>>    lognorm_pdf
>>>>>    lognorm_cdf
>>>>>    lognorm_log_chf
>>>>>
>>>>>    oruh_mean
>>>>>    oruh_var
>>>>>    oruh_std
>>>>>    oruh_pdf
>>>>>    oruh_cdf
>>>>>
>>>>>    hw2f_mean
>>>>>    hw2f_var
>>>>>    hw2f_std
>>>>>    hw2f_pdf
>>>>>    hw2f_cdf
>>>>>
>>>>>    cir_mean
>>>>>    cir_var
>>>>>    cir_std
>>>>>    cir_pdf
>>>>>
>>>>>    heston_log_mean
>>>>>    heston_log_var
>>>>>    heston_log_std
>>>>>    heston_log_pdf
>>>>>    heston_log_chf
>>>>>
>>>>>    mjd_log_pdf
>>>>>    mjd_log_chf
>>>>>
>>>>>    kou_mean
>>>>>    kou_log_pdf
>>>>>    kou_log_chf
>>>>>
>>>>>    bsd1d2
>>>>>    bscall
>>>>>    bscall_delta
>>>>>    bsput
>>>>>    bsput_delta
>>>>>
>>>>> Notes
>>>>> =====
>>>>> To the benefit of interactive and notebook sessions, and as an aid to
>>>>> fluently
>>>>> freeze or vary the plurality of parameters that define each stochastic
>>>>> process,
>>>>> all sources, process realizations and analytical results are objects
>>>>> with managed keywords (subclasses of `kfunc`): if the corresponding
>>>>> classes
>>>>> are instantiated with both variables (non keyword arguments) and
>>>>> parameters
>>>>> (keyword arguments), they behave as functions returning the
>>>>> computations'
>>>>> result; if called with parameters only, return an instance that stores
>>>>> the set
>>>>> parameters, and exposes the same calling pattern (call with variables
>>>>> and optionally with parameters to get results, call with parameters
>>>>> only
>>>>> to get a new instance storing modified parameters).
>>>>> Parameters that are not specified fall back to the class defaults, if
>>>>> calling
>>>>> the class, or to the instance's stored values, if calling an instance.
>>>>> """
>>>>>
>>>>> _______________________________________________
>>>>> SciPy-Dev mailing list
>>>>> SciPy-Dev at python.org
>>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at python.org
>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>
>>>
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>


More information about the SciPy-Dev mailing list