A Personal (and Professional) AI Journey
The long term impacts of AI may be even more radical than we first imagined. Indeed, the central thrust of the document that follows is that the very fabric of the financial services ecosystem has entered a period of reorganisation, catalysed in large part by the capabilities and requirements of AI.
The New Physics of Financial Services – How artificial intelligence is transforming the financial ecosystem
This is the second in a series of blog posts describing the personal and professional journey that the Mclowd Team and I have embarked upon with artificial intelligence (AI).
The first post sought to provide a base level of understanding regarding the key concepts which underpin AI.
In this post I will seek to build on this foundation, reflecting the content of a session which was run as part of our October Team Meeting.
As with the first post, the goal here is to illustrate that someone with only limited technical skills can harness the power of AI to achieve goals (commercial or non-commercial) that would otherwise be unrealistic.
Lean Start Up
The goal of the October session was to identify a Minimum Viable Product (MVP) through which we could start applying AI.
The purpose of an MVP approach is to drive validated learning, using a build, measure, learn feedback loop, as illustrated by the following diagram.
In the context of the Mclowd AI journey, Lean Start Up Principles provide the ideal methodology to go from a standing start to maximum velocity:
- In the shortest possible time
- With the least amount of resources
Context – The Future of Financial Services
The quote included at the start of this post is from the recent World Economic Forum / Deloitte paper on the impact that AI is having, and will have, on the financial services industry.
Along with that quote, the following table is a good summary of the research and its conclusions.
Mclowd Use Cases
In terms of MVP, there are numerous use cases that we could target:
- Data classification / matching
- Marketplace (recommendation engine)
- Technical support (chat bot)
- P2P analytics
The initial focus will be on replicating the workflow automation currently provided by incumbent software vendors. This approach also fits neatly with the upcoming launch of open banking in Australia.
Once this goal has been achieved (and our AI skillset upgraded), the residual use cases will be targeted.
Tools and Related Workflow
I’m aware that many readers will find the idea of embarking on an AI journey daunting.
However, some of the tools available (such as RapidMiner) provide a level of abstraction that makes AI more realistic for non-specialist users.
The analogy I would use is with web content management. I am quite capable of participating in the workflow associated with the Mclowd website without a deep understanding of HTML, because WordPress hides much of the complexity via a WYSIWYG interface.
Where opportunity cost (or some other constraint) prevents me from doing so, I can simply go to the crowd for the technical skills required.
It is now just the same with AI.
So if you are up for the challenge, read on!
The Machine Learning Lifecycle
The following diagram illustrates the machine learning life cycle, from:
- Defining a business problem, to
- Deploying a model which can predict an outcome (to a required level of accuracy)
Defining the Business Problem
In MVP mode, the business problem which Mclowd is seeking to solve is simple:
How do we automate the creation of journal entries associated with bank transactions in order to dramatically reduce the workload associated with SMSF accounting?
In using machine learning to solve this problem we have several advantages:
- A historical data set which can be used to train and evaluate models
- A relatively structured data set (at least in MVP mode)
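One common way to use a historical data set for both training and evaluation is to hold back part of it. The following is a minimal sketch of that idea rather than Mclowd's actual process: the function names `split_history` and `accuracy`, the 80/20 split, and the toy records with their categories are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A labelled record: (transaction description, known category).
# The categories used here are hypothetical examples.
Record = Tuple[str, str]


def split_history(records: Sequence[Record], eval_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[Record], List[Record]]:
    """Shuffle the historical records and hold back a fraction for evaluation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def accuracy(predict: Callable[[str], str], records: Sequence[Record]) -> float:
    """Fraction of records whose predicted category matches the known one."""
    if not records:
        return 0.0
    correct = sum(1 for description, label in records
                  if predict(description) == label)
    return correct / len(records)


# Toy history, invented for this sketch.
history = [
    ("Credit interest", "interest"),
    ("Interest paid", "interest"),
    ("NAB INTERIM DIV", "dividend"),
    ("WBC DIVIDEND", "dividend"),
    ("ZURICH LIFE", "insurance"),
]
train, evaluation = split_history(history)
```

A model would be trained on `train` only, and `accuracy` would then be measured on `evaluation`, so that the score reflects transactions the model has not seen before.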
Types of Machine Learning
As per the following diagram, there are three main types of machine learning:
- Unsupervised learning
- Supervised learning
- Reinforcement learning
In the case of transaction automation, we will be applying supervised learning.
In supervised learning we will use historical data to train one or more algorithmic models to classify transactions automatically. Supervision will initially be carried out by Mclowd until the requisite level of confidence has been achieved, with subsequent supervision / approval by individual users at a Fund / Firm level.
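To make the idea of supervised learning with human approval concrete, here is a minimal sketch using a small naive Bayes text classifier written with the Python standard library only. This is an illustrative technique chosen for brevity, not the model Mclowd will use. The class `NaiveBayesClassifier`, the function `classify_or_review`, the 0.8 threshold, and the training examples are all assumptions.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

CONFIDENCE_THRESHOLD = 0.8  # assumed value; below this a person reviews the result


def tokenize(description: str) -> List[str]:
    """Lower-case the description and keep purely alphabetic words."""
    return [w for w in description.lower().split() if w.isalpha()]


class NaiveBayesClassifier:
    """A tiny multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self) -> None:
        self.label_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocabulary: set = set()

    def train(self, examples: List[Tuple[str, str]]) -> None:
        """Count labels and words from (description, label) pairs."""
        for description, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(description):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, description: str) -> Tuple[str, float]:
        """Return the most likely label and its estimated probability."""
        total = sum(self.label_counts.values())
        words = tokenize(description)
        log_scores = {}
        for label, count in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            score = math.log(count / total)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            log_scores[label] = score
        # Convert log scores into probabilities that sum to one.
        top = max(log_scores.values())
        weights = {lab: math.exp(s - top) for lab, s in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


def classify_or_review(model: NaiveBayesClassifier, description: str) -> str:
    """Accept confident predictions; route the rest to a human reviewer."""
    label, confidence = model.predict(description)
    return label if confidence >= CONFIDENCE_THRESHOLD else "needs review"


# Hypothetical labelled history, invented for this sketch.
model = NaiveBayesClassifier()
model.train([
    ("Credit interest", "interest"),
    ("Interest paid", "interest"),
    ("NAB INTERIM DIV", "dividend"),
    ("WBC DIVIDEND", "dividend"),
    ("ANZ FINAL DIVIDEND", "dividend"),
])
```

With this toy data, a description made up entirely of words the model has never seen falls below the threshold and is sent for review, which mirrors the supervision step described above.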
Framing the Machine Learning Problem
Defining a machine learning problem involves three key elements:
- Observations (records in the data set)
- Labels (a variable we are trying to predict)
- Features (of the data – in our case date, amount, description, etc)
Fortunately for Mclowd we are working with relatively structured data, as illustrated by the following sample data:
| | Date | Description | Amount |
|---|---|---|---|
| 56577000 | 30 June 2017 | Interest paid | $112.74 |
| 56577000 | 1 July 2017 | Credit interest | $12.00 |
| 56577000 | 3 July 2017 | M0463 3F ZURICH LIFE 01711 | $183.99 |
| 56577000 | 5 July 2017 | DV191/00115375 NAB INTERIM DIV 0073 | $1,189.02 |
| 56577000 | 6 July 2017 | 00121889952 WBC DIVIDEND 256 | $1,267.52 |
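The sample rows above map naturally onto the three elements of the framing: each row is an observation, its columns are features, and the label is the value still to be supplied. Here is a minimal sketch of that mapping. The names `Observation` and `parse_row` are hypothetical, the meaning of the first column is not stated in the data so it is called `reference` here as an assumption, and the label value shown at the end is invented.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import List, Optional


@dataclass
class Observation:
    """One bank transaction: its features plus an optional label."""
    reference: str               # first column of the sample data (meaning assumed)
    posted: date                 # feature: transaction date
    description: str             # feature: free-text description
    amount: Decimal              # feature: transaction amount
    label: Optional[str] = None  # the value a model would learn to predict


def parse_row(row: str) -> Observation:
    """Parse one pipe-delimited row; empty cells between pipes are ignored."""
    cells = [cell.strip() for cell in row.split("|") if cell.strip()]
    reference, raw_date, description, raw_amount = cells
    return Observation(
        reference=reference,
        posted=datetime.strptime(raw_date, "%d %B %Y").date(),
        description=description,
        amount=Decimal(raw_amount.replace("$", "").replace(",", "")),
    )


rows: List[str] = [
    "|56577000||30 June 2017||Interest paid||$112.74|",
    "|56577000||5 July 2017||DV191/00115375 NAB INTERIM DIV 0073||$1,189.02|",
]
observations = [parse_row(row) for row in rows]

# In a training set, a person would have supplied the label already.
observations[0].label = "interest"  # hypothetical label value
```

Amounts are held as `Decimal` rather than `float` so that currency values are not subject to binary rounding.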
A Mclowd Methodology
The Mclowd methodology for framing draws a distinction between ‘classification’ and ‘matching’. The reasoning is that breaking the exercise down into smaller parts will make it easier to deliver an MVP solution (which will be focused on classification).
Classification is – in effect – a means of automating the creation of allocation rules for relatively straightforward ‘events’, such as interest, rent, and basic / recurrent expenses.
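For readers unfamiliar with allocation rules, the following is a minimal sketch of what a hand-written set might look like, which is the kind of mapping a classifier would be expected to learn from historical data instead. The rule table `ALLOCATION_RULES`, the function `allocate`, the keywords, and the account names are all hypothetical.

```python
from typing import List, Optional, Tuple

# Each rule pairs a keyword with the account a matching transaction
# is allocated to. Keywords and account names are invented examples.
ALLOCATION_RULES: List[Tuple[str, str]] = [
    ("interest", "Interest Income"),
    ("rent", "Rental Income"),
    ("insurance", "Insurance Expense"),
]


def allocate(description: str) -> Optional[str]:
    """Return the account of the first rule whose keyword appears as a
    whole word in the description, or None if no rule applies."""
    words = description.lower().split()
    for keyword, account in ALLOCATION_RULES:
        if keyword in words:
            return account
    return None
```

Matching on whole words rather than substrings avoids, for example, "rent" being found inside "current". Transactions for which `allocate` returns `None` would remain for manual handling.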
While classification of this kind is suitable as an MVP target, there are numerous examples of bank transactions that would not fit this model (e.g. a franked dividend, or a distribution which is subject to withholding tax).
In addition to the above scenarios, there will be some bank transactions that don’t lend themselves to automation, as illustrated by the following screenshot:
These are not likely targets for automation, but represent a relatively small proportion of the workload – estimated at less than 10%.
Having framed both the business and machine learning problems, the next stage for us will involve:
- Data preparation
- Feature engineering
These steps will be the focus of our next post.