Python

Class Inheritance in Data Science

Object-oriented programming (OOP) techniques such as classes and inheritance are common in many application programming environments but don't travel well outside computer memory. For data science tasks and objectives, the transition from object hierarchies to data structures (and vice versa) is not always straightforward. In this short course we explore how some programming languages, data formats, database APIs and web frameworks handle hierarchical classes.

Reading Time: 3 min.
The potentially intricate relationships between objects (both the data they hold and the meaning and possible uses of that data) are not easy to transfer, except of course by fully replicating both code and data.
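To illustrate why class hierarchies do not travel well, here is a minimal sketch of a round trip from an object to JSON and back. The `Instrument` and `Loan` classes, the `type` tag and the `REGISTRY` lookup are all hypothetical names chosen for this example; they are not taken from the course material. The point is that the inheritance relationship itself is not stored in the JSON: the reader needs the same class definitions to rebuild the object.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Instrument:
    """Hypothetical base class for a financial instrument."""
    name: str
    notional: float


@dataclass
class Loan(Instrument):
    """Hypothetical subclass that adds a loan-specific field."""
    interest_rate: float


# The reading side needs a way to map a stored type tag back to a class.
REGISTRY = {cls.__name__: cls for cls in (Instrument, Loan)}


def to_json(obj):
    """Serialize a dataclass instance, recording its class name as a type tag."""
    payload = asdict(obj)
    payload["type"] = type(obj).__name__
    return json.dumps(payload)


def from_json(text):
    """Rebuild an instance; this only works if the class is known locally."""
    payload = json.loads(text)
    cls = REGISTRY[payload.pop("type")]
    return cls(**payload)


loan = Loan(name="L-001", notional=1000.0, interest_rate=0.05)
text = to_json(loan)
restored = from_json(text)
print(text)
print(restored)
```

The JSON text carries only field values and a label. That `Loan` is a kind of `Instrument` is knowledge that lives in the code, which is why the receiving side must replicate the class definitions.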
Open Risk Hydra GSOC 2021 Credit Risk Project Wrap Up

Reading Time: 5 min.
The GSOC 2021 collaboration between Open Risk and the Hydra Ecosystem: Project Wrap-Up. Google Summer of Code 2021 came and went amid the still ongoing worldwide pandemic. Open Risk was happy to join forces with the Hydra Ecosystem in exploring a proof-of-concept for next-generation APIs using Hydra. The project aimed to guide students to build a hypermedia-enabled REST service that can serve standardized credit portfolio data.

Open Risk Mentoring GSOC 2021 Hydra Nextgen API Project

For the Google Summer of Code 2021 season, Open Risk is happy to join forces with the Hydra Ecosystem to mentor a student project that aims to build a hypermedia-enabled REST service around standardized credit portfolio data.

Reading Time: 4 min.
A GSOC 2021 summer project collaboration between Open Risk and the Hydra Ecosystem. Summer is underway, and for the Google Summer of Code 2021 season Open Risk is happy to join forces with the Hydra Ecosystem. The project aims to guide students to build a hypermedia-enabled REST service around standardized credit portfolio data. More specifically, the project will build a REST service as the backend for a hypothetical banking entity that collects and disseminates credit portfolio data conforming to an established public standard (the EBA NPL templates, see below).
Equinox: a Platform for Sustainable Project Finance Risk Management

On Earth Day 2021 we are happy to launch Equinox, an open source platform supporting sustainable project finance.

Reading Time: 5 min.
Equinox is an open source platform that supports risk management and reporting of Project Finance. The platform integrates geospatial information with applicable regulatory and industry standards from EBA, PCAF and Equator Principles to provide a holistic view of the footprint of both individual projects and portfolios of project finance investments. Motivation: Sustainability (understood in environmental, economic and social terms) is emerging as an undisputed constraint that will shape future human activity and more specifically how the financial system facilitates and empowers economic life.
Visual Overview of Built in Python Data Types

We discuss the Python language built-in data types and a visualization that organizes them according to key attributes

Reading Time: 4 min.
Data Types are a fundamental building block of data science. Data science is about data, but data are not simple and tame beasts. They have character and attitude, which can cause a lot of friction between them and the data scientist. There is a lot of sweat and tears involved when confronting data, and data scientists could do worse than learn how to handle Data Type quirks in particular. Namely, a good fraction of data science involves not modelling data, not transforming data, not even cleaning data, but simply goading data into the right containers, providing them with the right stage that fits their character.
An introduction to Semantic Python

A CrashCourse introduction to semantic data using Python covering a number of frameworks such as rdflib, owlready and pySHACL

Reading Time: 2 min.
Course Content: This CrashCourse is an introduction to semantic data using Python. It covers the following topics: we learn to work with RDF graphs using rdflib; we explore the owlready package and OWL ontologies; we look into JSON-LD serialization of RDF/OWL data; we try data validation using pySHACL. Throughout, we use a realistic data set based on the Credit Ratings Ontology. Who Is This Course For: The course is useful to:
What do people talk about at FOSDEM 2021

FOSDEM is the Free and Open Source Software Developers European Meeting.

Reading Time: 3 min.
Introduction: What is FOSDEM? FOSDEM is a non-commercial, volunteer-organized event centered on free and open-source software development (with a geographic focus on European open source ecosystems and projects). FOSDEM is aimed at developers and anyone interested in the free and open-source software movement. It aims to enable developers to meet and to promote the awareness and use of free and open-source software. FOSDEM has been held annually since 2001, usually during the first weekend of February, at the Université Libre de Bruxelles Solbosch campus in the southeast of Brussels, Belgium.
Logarithmic Sankey Visualization of Credit Migrations

Sankey diagrams are very useful for the visualization of flows, especially when there is a conserved quantity. They can be tricky to read when some of the flows are much smaller than others. In the latest release of transitionMatrix we include an example of a log-scale version of a Sankey diagram.

Reading Time: 5 min.
Using Sankey Diagrams: Sankey Diagrams are a type of flow diagram composed of interconnected arrows. The width of the arrows is proportional to the flow rate. Sankey diagrams are often used in physical sciences (physics, chemistry, biology) and engineering but also in economics. They can be used to represent the relative role and significance of various inputs and outputs in a given process. Sankey diagrams emphasize the major transfers within a system.
Back to School With the Open Risk Academy

For Back-to-School 2020 we have more ways to access the Academy, new functionalities and more courses. In the rest of this post you will find a summary of the changes with pointers to further information where required.

Reading Time: 4 min.
Risk Management will not be the same going forward: too much is at stake! The summer is over in the Northern Hemisphere, and what an unusual summer it has been! Worldwide, the implications and challenges of adjusting to the Covid-19 pandemic are still a major issue, affecting individuals, companies and governments. At Open Risk we have been tracking and will continue to interpret the impact of the pandemic via a number of projects:
21 Ways to Visualize a Timeseries

We explore a variety of distinct ways to visualize the same simple dataset

Reading Time: 26 min.
What this blog post is about (and what it isn’t): With the ever more widespread adoption of Data Science, defined as the intensive use of data in various forms of decision making, there is a renewed interest in Visualization as an effective channel for humans to understand data at various stages of the data lifecycle. There is a large variety of data visualization tools which can produce an ever more bewildering variety of visualization types
openNPL: Open Source NPL Platform - First Release

We introduce an open source platform that allows the easy management of non-performing loan data

Reading Time: 4 min.
Non-Performing Loans: The Covid-19 crisis will certainly impact the concentration of Non-Performing Loans, but given the special nature of this economic crisis compared (in particular) with the 2008 financial crisis, it is unclear how precisely things will evolve. In a previous post and white paper (OpenRiskWP07_022616) we discussed the importance of advancing open and transparent methodologies for managing the risks associated with such credit portfolios. Effective management of NPLs is also a top regulatory priority.
New Open Risk Academy Course: Introduction to Geojson

Reading Time: 2 min.
Course Content: This course is a CrashProgram (short course) introducing the GeoJSON specification for the encoding of geospatial features. The course is at an introductory technical level. It requires some familiarity with data specifications such as JSON and a very basic knowledge of Python. Who Is This Course For: The course is useful to any developer or data scientist who wants to work with geospatial features encoded in the GeoJSON format. How Does The Course Help: Mastering the course content provides background knowledge towards the following activities:
What do people talk about at FOSDEM 2020

FOSDEM means Free and Open Source Software Developers European Meeting

Reading Time: 4 min.
Introduction: FOSDEM is a non-commercial, volunteer-organized European event centered on free and open-source software development. It is aimed at developers and anyone interested in the free and open-source software movement. It aims to enable developers to meet and to promote the awareness and use of free and open-source software. FOSDEM has been held annually since 2001, usually during the first weekend of February, at the Université Libre de Bruxelles Solbosch campus in the southeast of Brussels, Belgium.
Federated Credit Risk Models

Reading Time: 4 min.
The motivation for federated credit risk models: Federated learning is a machine learning technique that is receiving increased attention in diverse data-driven application domains that have data privacy concerns. The essence of the concept is to train algorithms across decentralized servers, each holding its own local data samples, without the need to exchange potentially sensitive information. The construction of a common model is achieved through the exchange of derived data (gradients, parameters, weights etc.).
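The exchange of derived data can be sketched with a toy version of federated averaging, one common way of combining locally trained parameters. This is an illustrative assumption, not necessarily the method discussed in the post: the one-parameter model `y ≈ w * x`, the function names and the synthetic client data are all made up for the example. Each client improves the shared weight on its own data, and only the updated weight (never the data) is sent back to be averaged.

```python
def local_update(w, data, lr=0.01, steps=20):
    """Run a few gradient-descent steps for y ≈ w * x on one client's data."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, clients):
    """Average the clients' locally updated weights, weighted by sample count."""
    total = sum(len(data) for data in clients)
    return sum(len(data) * local_update(w_global, data) for data in clients) / total


# Synthetic local data sets (never shared); both follow y = 2 * x exactly.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)

print(w)
```

Only the scalar `w` crosses the boundary between a client and the coordinating server in this sketch. Real deployments add further protections, since derived data can still leak information about the underlying samples.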
Overview of the Julia-Python-R Universe

We introduce a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python, R, the trio sometimes abbreviated as Jupyter

Reading Time: 3 min.
Overview of the Julia-Python-R Universe: A new Open Risk Manual entry offers a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python, R, sometimes abbreviated as Jupyter. Motivation: A large component of Quantitative Risk Management relies on data processing and quantitative tools (aka Data Science). In recent years open source software targeting Data Science has found increased adoption in diverse applications. The Overview of the Julia-Python-R Universe article is a side-by-side comparison of a wide range of aspects of the Python, Julia and R language ecosystems.
Open Source Securitisation

Reading Time: 5 min.
Open Source Securitisation: Motivation. After the Great Financial Crisis, securitisation has become the poster child of a financial product exhibiting complexity and opaqueness. The issues and lessons learned post-crisis were many, involving all aspects of the securitisation process, from the nature and quality of the underlying assets to the incentives of the various agents involved and the ability of investors to analyze the products they invested in. While the most egregious complications involved various types of re-securitisation and/or the interplay of structured credit derivatives, even a vanilla securitisation structure undoubtedly involves a considerable amount of business logic.
Python versus R Language: A side by side comparison for quantitative risk modeling

Reading Time: 3 min.
Motivation for the comparison: A large component of risk management relies on data processing and quantitative tools. In turn, such information processing pipelines and numerical algorithms must be implemented in computer systems. Computing systems come in an extraordinarily large variety, but in recent years open source software has found increased adoption for diverse applications (machine learning, data science, artificial intelligence). In particular, cloud computing environments are primarily based on open source projects at the systems level.
Version 0.4 of the Concentration Library adds geographic / industrial concentration functionality

Reading Time: 1 min.
Release of version 0.4 of the Concentration Library adds Geographic / Industrial concentration indexes: Further building out the OpenCPM set of tools, we release version 0.4 of the Concentration Library, a Python library for the computation of various concentration, diversification and inequality indices. The list below provides documentation URLs for each one of the implemented classic indexes (the Hoover index is a new addition in this release): Atkinson Index, Hoover Index, Concentration Ratio, Berger-Parker Index, Herfindahl-Hirschman Index, Hannah-Kay Index, Gini Index, Theil Index, Shannon Index, Generalized Entropy Index, Kolm Index. An important new direction that appears first in this release is the introduction of indexes that measure geographical and industrial concentration.
Release of version 0.3 of the Concentration Library

Reading Time: 0 min.
Release of version 0.3 of the Concentration Library: Further building out the OpenCPM set of tools, we release version 0.3 of the Concentration Library. This is a Python library for the computation of various concentration, diversification and inequality indices. The list below provides documentation URLs for each one of the implemented indexes: Atkinson Index, Concentration Ratio, Berger-Parker Index, Herfindahl-Hirschman Index, Hannah-Kay Index, Gini Index, Theil Index, Shannon Index, Generalized Entropy Index, Kolm Index. The image illustrates a simple use of the library where the HHI and Gini indexes are computed and compared for a range of randomly generated portfolio exposures.
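As a rough illustration of that comparison, the sketch below computes the two indexes from their textbook definitions using only the standard library. It deliberately does not use the Concentration Library's own API; the function names `hhi` and `gini` and the lognormal exposure generator are assumptions made for this example, and the library may apply different normalizations.

```python
import random


def hhi(exposures):
    """Herfindahl-Hirschman Index: the sum of squared portfolio shares."""
    total = sum(exposures)
    return sum((x / total) ** 2 for x in exposures)


def gini(exposures):
    """Gini index from the rank-weighted sum of sorted exposures."""
    values = sorted(exposures)
    n = len(values)
    total = sum(values)
    ranked = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * ranked / (n * total) - (n + 1) / n


# Compare the two indexes on a few randomly generated portfolios.
random.seed(0)
for size in (10, 100, 1000):
    portfolio = [random.lognormvariate(0, 1) for _ in range(size)]
    print(size, round(hhi(portfolio), 4), round(gini(portfolio), 4))
```

For a portfolio of `n` equal exposures the HHI is `1/n` and the Gini index is zero, while a single dominant exposure pushes the HHI towards one. The two measures therefore respond differently as the number of exposures grows.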
Transition Matrix Library First Release

Reading Time: 2 min.
Open Risk released version 0.1 of the Transition Matrix Library. Motivation: State transition phenomena, where a system exhibits stochastic (random) migration between well defined discrete states (see picture below for an illustration), are very common in a variety of fields. Depending on the precise specification and modelling assumptions they may go under the name of multi-state models, Markov chain models or state-space models. In financial applications, a prominent example of a phenomenon that can be modelled using state transitions is the credit rating migration of pools of borrowers.
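A minimal sketch of the idea, under stated assumptions: given observed one-period rating migrations, a simple frequency (cohort-style) estimate of the transition matrix divides the count of each origin-destination pair by the number of observations leaving that origin. The function name, the three rating states and the sample observations are hypothetical, and this does not use the Transition Matrix Library's own API.

```python
from collections import Counter


def estimate_transition_matrix(transitions, states):
    """Estimate transition probabilities as observed frequencies per origin state."""
    counts = Counter(transitions)
    matrix = []
    for origin in states:
        row_total = sum(counts[(origin, dest)] for dest in states)
        if row_total == 0:
            # Assumption for this sketch: an unobserved state is treated as absorbing.
            row = [1.0 if dest == origin else 0.0 for dest in states]
        else:
            row = [counts[(origin, dest)] / row_total for dest in states]
        matrix.append(row)
    return matrix


# Hypothetical one-period migrations between ratings A, B and default (D).
states = ["A", "B", "D"]
observed = (
    [("A", "A")] * 3 + [("A", "B")]
    + [("B", "A")] + [("B", "B")] * 2 + [("B", "D")]
    + [("D", "D")] * 2
)

matrix = estimate_transition_matrix(observed, states)
for origin, row in zip(states, matrix):
    print(origin, row)
```

Each row is a probability distribution over destination states, so it sums to one. Default behaves as an absorbing state here because every borrower observed in `D` stays in `D`.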