40 ISE Magazine | www.iise.org/ISEmagazine
A decision tree, a special form of tree diagram, has long
been a popular tool among managers and professionals.
They use it to make a sound decision or to find a solution
to an issue that arises repeatedly. Because the tool is
user-friendly, its use has extended to machine learning,
where it is known as decision tree analysis, which is the
reason for revisiting it now.
Uses for a decision tree include supporting the decision-
making process, finding a solution to a repeatable problem,
training computers from the data and developing a predictive
model, along with encoding the work rules that can be applied
automatically by computers. Using management tools and ana-
lytics has become a competitive advantage to businesses in this
era of technology revolution.
We will address the two main uses of the decision tree through
two examples. The first is about a delay in delivering engineering
projects, which was experienced by a global engineering
company. The second is about Chase Bank’s mortgage risk,
which has been excerpted from Eric Siegel’s book “Predictive
Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.”
In addition, we will briefly discuss machine learning and its
intended use in predicting individuals’ future behavior from
Grow a decision tree to support decision-making, machine learning
Management tool can help solve recurring
problems in a systematic way
By Alaa Kafafi
August 2019 | ISE Magazine 41
past data (predictive analytics; see related article at right).
As we will see, creating the decision tree is simple. We develop
a statement of the goal or problem in question, then ask
sequential questions that lead to the next level of detail until
we come to the correct decision or solution at the end point.
Our examples will show how the tree is drawn. Finally, we
will test the tree for its ability to lead to the correct decision in
different scenarios.
Supporting the decision-making process
Let’s see a hands-on example of using a decision tree to sup-
port the process of decision-making. Then, we will see that
the principles in this example are also valid for the process of
machine learning.
A delay in delivering a project usually has consequences for
both the client and contractors; the client usually has commit-
ments based on the date of a project’s completion. In the oil
and gas industry, for instance, the client was previously obli-
gated to supply gas, oil or other products to the customers just
after completion of the project. Accordingly, noncompletion
of the project on schedule will leave the client in bad shape.
Another example to cite is a delay in time-to-market for a
newly developed product that will put the manufacturer be-
hind the competitors.
We had a major engineering project that was suffering a
delay in schedule; the root cause was recognized as turnover
among the project’s team members. This cause had been so
severe that the project team was changed completely more than
once. The idea behind using the decision tree was to construct
a tool to support the project manager taking the proper action
when he or she noticed a pattern that could cause a delay in the
project delivery. Accordingly, to construct the decision tree a
team of managers and experienced engineers went through
many brainstorming sessions to find all the reasons that led to
a delay in the schedule for previous projects.
Figure 1 (on page 42) shows the problem, a delay in project
delivery, and its four main causes: an understaffed project,
frequent changes in the project scope, quality problems that
lead to rework, and contingencies.
In the first main cause, employee turnover and other factors
led the project to be understaffed. It was obvious that policies
for retaining and motivating employees should be upgraded.
Besides the benefits offered to employees, we decided to hold a
team-building meeting every month. This tea party was given
to recognize and reward good performances.
In addition, we agreed on conducting an employee satisfaction
survey semi-annually to better understand the employees’
perception of value. Finally, a succession plan for each project
was developed. In case a key team member left, a known re-
placement would take over smoothly.
Fluctuations in the need for resources were another contributor
to the issue. To level the resources during the execution
of the project, we recommended recruiting contractors
during the peak need times, along with engineers who were
between projects. The latter were needed to level the resources
through the entire company by starting different projects on a
staggered schedule. Overloaded resources were another factor shown
in the same branch of the tree: when an engineer was working
on more than one project, his or her productivity was lower than
that of an engineer devoted to a single project. Since coordination
with each project team consumed a big portion of the working
time, we came to a consensus to limit the contribution of
a single engineer to two projects at most.
The frequent change of the project’s scope (the second
branch of the tree) was another main cause of falling behind schedule.
We found that this cause was largely attributable to miscommunication
with the client. The remedy was to develop a
documented communication plan for each phase and to ensure
that, by the time we completed the front-end engineering design stage (a
higher-level design phase not affected much by changes) and
commenced the detailed engineering phase, we had already
agreed with the client on about 90% of the project’s
scope; the remaining 10% was considered an allowance for
contingencies. In addition, we recommended allocating spare
Applying predictive analytics
The practice of predictive analytics has many different problem-
solving applications in today’s business environment. Here are a
few ways it is being used, according to www.sas.com.
Detecting fraud. With cybersecurity becoming a growing
priority, companies can combine analytics methods for
pattern detection that can help prevent criminal behavior.
Such analytics can spot abnormalities in real time to weed out
potential threats.
Marketing campaigns. Customer preferences, responses
and purchases can be optimized, along with promotion of
cross-sell opportunities.
Improving operations. Companies can forecast inventory
needs and manage their resources more efficiently through
predictive analytics.
Reducing risk. Credit scores are an example of how
predictive analytics can assess all data relevant to a person’s
creditworthiness. Other uses include insurance claims and
collections.
resources to meet minor changes in the
project’s scope.
The third branch of the tree dealt
with quality problems that led to rework
and drove the project off schedule.
Where the nonconformances were against
standards, codes and regulations, a technical
review was the tool to find those design
errors and correct them. If the project
team was technically competent but
weak in process, then a quality audit was
the tool to disclose noncompliance with
internal procedures (quality and operating
procedures) and complete corrective actions.
We recommended increasing the
frequency of reviews or audits according
to the apparent nonconformances.
Also, project managers were advised to
hold “lunch ’n learn” sessions to train on
weak technical areas and ensure that all
the project team members went through
training on internal procedures. Training
on quality procedures was usually a part
of the onboarding training for engineers,
while training on operational procedures
was held just after forming the
project team.
Our quality audits of the projects revealed
a major cause for not complying
with the operating procedures: newcomers,
who were usually from similar engineering
companies, came to us with their
experience from the previous employers’
culture. They found it easier to do
the work the way they used to rather
than considering the culture of the new
employer. Indeed, the culture of the business is the reason behind
its existence and continuation; we can’t impose the culture
of another firm on our business even if this firm is a leader
in the industry. Accordingly, project managers were advised
to monitor newcomers’ way of executing the work
and to educate them about the company’s culture.
One way to place countermeasures against contingencies,
the fourth main cause of the issue (events not
considered during the planning phase), was to refer to the lessons
learned from similar projects to identify those events first.
We usually went through lessons learned at the planning phase
and placed countermeasures against the documented contingencies
from previous projects. Then, when closing the project, we
discussed and documented the new lessons.
We found that holding a quarterly meeting to discuss the
unplanned events would help decrease the effects of contin-
gencies rather than waiting until completion of the project.
During these meetings, management and representatives from
other projects attended with the project team discussing what
went well and what did not go as planned and the appropri-
ate measures to keep the project on schedule. These meetings
were a good opportunity to share knowledge and spread good
practices among all the company’s projects. But the main advantage
of increasing the frequency of these meetings was that,
when unpredictable events (black swans) occurred, we were better
prepared to cope with them. Figure 1 shows the outcome
of the above analysis with 14 decisions in red.
FIGURE 1
What made the project late?
The various causes for delay in delivering a project to the customer, with the
decisions made to seek a solution for each in red type.
As we saw from this example of a decision tree, when we go
deep in analysis by asking questions at each level of detail and
move from one level to another, the decisions taken may resolve
problems other than the one in question. For example, the succession
plans developed were intended to solve the issue of employee
turnover, but they also resolved the cases of sickness and
maternity leaves, which were considered contingencies. Also,
the policies for retaining and motivating project team members
could benefit the employees of the supporting functions.
This also illustrates that testing of the tree was successful.
When we developed the tree, the issue was a delay in schedule
and the cause was known (engineer turnover); we then added
many causes based on past experience. In each scenario, the
tree worked well and led the user to the appropriate decision.
Finally, we might have missed one or more variables that affected
the issue under analysis, either because the variable was insignificant
or because it was not repeatable, as the considered variables
were. However, as long as the tree led the user to the correct
decision in every scenario, testing of the tree
was successful and the tree could be expected to do well in general.
We could not calculate the monetary value of this exercise.
However, customer satisfaction surveys showed that
customers became more satisfied when seeing their projects
on schedule. In the year after we introduced this decision
tree, a potential client agreed to enter into a long-term
alliance with us through a service agreement. We then enjoyed
working with this customer without bidding, which
was an indication of the high loyalty of our clients.
When we performed this exercise, the team searched only
one database set, lessons learned, of one geographical area,
North America, and for the last three years. Suppose we extended
the search to all project database sets of all geographical
areas for the last 30 years in order to not miss any variable
that may affect the issue in question. Then the exercise goes
beyond human capabilities, but this kind of analysis is
common in the big data era in which we live. This is where
machines come in, as we will see from the following example.
Since we will address the use of the decision tree in the area
of machine learning, it’s worth explaining this term briefly.
Machine learning’s task is to find patterns that appear in
the data, so that what is learned will hold true in situations
never yet encountered. Accordingly, machine learning pro-
cesses training data to produce a predictive model. Then this
model takes the characteristics of the individual as input and
provides a predictive score as output. The higher the score,
the more likely it is the individual will exhibit the predicted
behavior.
In short, the predictive model is a mechanism that pre-
dicts the behavior of an individual, such as buying a product,
clicking an ad or prepaying a mortgage, as the following ex-
ample will show.
We can now define the overarching technology within
which machine learning works: predictive analytics (PA),
technology that learns from experience (data) to predict the
future behavior of individuals in order to drive better decisions.
The alternative, risk-oriented definition of PA is technology
that learns from experience (data) to manage micro
risk. This illustrates the capabilities and limitations of PA
technology.
It is usually what individuals have done that predicts what
they will do, and people who have done something a lot are
more likely to do it again. So PA combines demographic data about
individuals (gender, education, location, age, etc.) with behavioral
predictors such as frequency, purchases, financial activity
and product usage, such as calls and web surfing. These
behaviors are often the most valuable; it’s always a behavior
that we seek to predict.
On the other hand, PA applications cannot be used to manage
macro risks; for that reason, we could not foresee or prevent
the global financial crisis in 2008 or other
“black swan” events.
While the terms discussed are self-explanatory, the following
example will explain these concepts further.
Managing risk of mortgage prepayment
Chase Bank performed the following exercise to manage the
risk associated with millions of mortgages sold to its clients. In
general, there are two kinds of risk: borrower default and prepayment
of the mortgage. The exercise was confined to the latter
type of risk, which prevented the bank from collecting the interest
throughout the whole period of the mortgage; with each
mortgage prepayment, the bank lost a profitable customer. An
accumulation of individuals who were motivated to prepay
their mortgage would cause huge losses to the bank.
The objective of the exercise was to identify the mortgages
that were at high risk of prepayment (within three months)
and mitigate that risk either by refinancing them or by selling them to
another bank, a common practice between banks. The computer
was trained to do this task using learning software that
utilizes decision tree analysis. Let’s see how learning from data
occurs and how the risk of prepayment is quantified.
Learning from data is an easy process, so let’s ask a ques-
tion, as we did in the previous example, to start drawing the
tree: What is the main factor that drives a client to prepay the
mortgage?
FIGURE 2
Analyzing prepayment risk
The risk of a customer paying off a mortgage early was markedly
different based on the interest rate of the loan.
It was found from the data that if the mortgage interest rate
is under 7.94%, the risk of prepayment is 3.8%; otherwise, the
risk is 19.2%. Accordingly, the pool of mortgages was divided
into two groups (see Figure 2). This is logical since homeown-
ers paying a higher interest rate are more inclined to refinance
or sell than those paying a lower rate.
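This first split can be sketched in a few lines of Python. It is only an illustration, not the bank's software: the record layout, the function name `prepayment_rate_by_split` and the six toy mortgages are assumptions invented for the example. Only the 7.94% threshold comes from the text.

```python
def prepayment_rate_by_split(mortgages, threshold):
    """Split mortgages on interest rate and return each group's prepayment rate.

    Each mortgage is a dict with the hypothetical keys 'rate' (percent)
    and 'prepaid' (True/False).
    """
    low = [m for m in mortgages if m["rate"] < threshold]
    high = [m for m in mortgages if m["rate"] >= threshold]

    def rate(group):
        # Share of the group that prepaid; None for an empty group.
        return sum(m["prepaid"] for m in group) / len(group) if group else None

    return rate(low), rate(high)


# Toy data, invented for illustration only (not Chase data).
toy_mortgages = [
    {"rate": 6.5, "prepaid": False},
    {"rate": 7.0, "prepaid": False},
    {"rate": 7.5, "prepaid": False},
    {"rate": 7.9, "prepaid": True},
    {"rate": 8.2, "prepaid": True},
    {"rate": 9.1, "prepaid": False},
]

low_risk, high_risk = prepayment_rate_by_split(toy_mortgages, 7.94)
print(low_risk, high_risk)
```

On real data, the two returned rates would correspond to the risk figures shown for the two groups in Figure 2.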
To continue drawing the tree and move from one level of
detail to another, we have to find another factor that best breaks
one of the two risk groups further into two subgroups that vary
in risk. Then we do the same thing with the other risk group
and keep going within the subgroups to divide and break down
to smaller and smaller groups. This learning method is called
decision trees, which is not the only method to create a predic-
tive model – other methods use regression, for example – but it
is popular and simple besides being effective.
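How the software might pick the factor that "best breaks" a group can be sketched as follows. The text does not say which criterion Chase's software used, so this sketch assumes weighted Gini impurity, a common choice in decision tree learning. The function names and toy records are likewise invented for illustration.

```python
def gini(outcomes):
    """Gini impurity of a list of True/False outcomes."""
    if not outcomes:
        return 0.0
    p = sum(outcomes) / len(outcomes)
    return 2 * p * (1 - p)


def best_split(records, features):
    """Return (feature, threshold, score) with the lowest weighted impurity.

    records: list of dicts holding numeric features plus a boolean 'prepaid'.
    Every observed value of every feature is tried as a threshold.
    """
    best = None
    for feature in features:
        for threshold in sorted({r[feature] for r in records}):
            left = [r["prepaid"] for r in records if r[feature] < threshold]
            right = [r["prepaid"] for r in records if r[feature] >= threshold]
            if not left or not right:
                continue  # a split must produce two non-empty groups
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Toy records, invented for illustration only (not Chase data).
toy_records = [
    {"rate": 6.5, "income": 50_000, "prepaid": False},
    {"rate": 7.0, "income": 90_000, "prepaid": False},
    {"rate": 7.5, "income": 60_000, "prepaid": False},
    {"rate": 8.2, "income": 70_000, "prepaid": True},
    {"rate": 8.9, "income": 55_000, "prepaid": True},
    {"rate": 9.1, "income": 95_000, "prepaid": True},
]

print(best_split(toy_records, ["rate", "income"]))
```

Growing the tree then amounts to calling `best_split` again on each resulting subgroup until the groups are too small or too uniform to divide further.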
Now let’s find a predictor variable that breaks the low-risk
group down further. The decision tree software picks the
debtor’s income. As we see, the tree is growing and showing
that the mortgage holder’s income is very telling of risk.
The subgroup of mortgage holders whose interest rate
is under 7.94% and whose income is under $78,223 is so far
the lowest-risk group identified, with only a 2.6% chance of
prepayment (Figure 3).
Moving to the right side of the tree to further break down the
higher-risk group, the learning software selects mortgage size.
With only two factors considered, we have identified a risky
segment: higher-interest mortgages that are larger in magnitude
show a 36% chance of prepayment (Figure 4).
Figure 5 shows the tree of Chase mortgage data after several
more learning steps (go left for “yes” and right for “no”). The
tree has now discovered 10 distinct segments with risk levels
ranging from 2.6% all the way up to 40%.
Let’s now select one borrower and see
the probability of prepaying the mortgage
using this decision tree.
Borrower’s characteristics
Mortgage: $174,000
Property value: $400,000
Property type: single-family home
Interest rate: 8.92%
Annual income: $86,880
Net worth: $102,334
Credit score: strong
Late payments: 4
Age: 38
Marital status: Married
Education: college
Years at prior address: 4
Line of work: Business manager
Self Employed: No
Years at job: 3
Let’s start at the top of the tree and answer yes/no questions:
Q: Is the interest rate under 7.94%?
A: No, go right.
Q: Is the mortgage under $182,926?
A: Yes, go left.
Q: Is the loan-to-value ratio under 87.4%?
A: Yes, go left (the loan is less than 87.4%
of the property value).
Q: Is the mortgage under $67,751?
A: No, go right.
FIGURE 4
Higher rates, higher risk
The riskiest segment identified so far for mortgage prepayment is the group with larger mortgages
and higher interest rates.
FIGURE 3
Income as a key factor
The decision tree shows that the lower the homeowners’ income,
the less likely they are to pay off their mortgages early.
Q: Is the interest rate under 8.69%?
A: No, go right.
The borrower lands in the segment with a
25.6% propensity (follow the red arrows in Figure 5). The average
risk overall is 9.4%, so this borrower is far more likely
than average to prepay the mortgage.
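The walk down the tree can be replayed in Python as a check on the answers. This is a sketch under stated assumptions: only the five questions on this borrower's path are encoded, since the other branches of Figure 5 are not listed in the text, and the dictionary keys and variable names are made up for the example.

```python
# Borrower values taken from the list of characteristics above.
borrower = {"mortgage": 174_000, "property_value": 400_000, "interest_rate": 8.92}

# Loan-to-value ratio in percent: 174,000 / 400,000 = 43.5%.
borrower["loan_to_value"] = 100 * borrower["mortgage"] / borrower["property_value"]

# Questions on this borrower's path, in the order asked:
# (feature, threshold); the answer is "yes" when feature < threshold.
path_questions = [
    ("interest_rate", 7.94),
    ("mortgage", 182_926),
    ("loan_to_value", 87.4),
    ("mortgage", 67_751),
    ("interest_rate", 8.69),
]

answers = ["yes" if borrower[f] < t else "no" for f, t in path_questions]
print(answers)  # ['no', 'yes', 'yes', 'no', 'no']

segment_risk = 25.6  # percent, reported for the segment at the end of this path
average_risk = 9.4   # percent, reported overall average
lift = segment_risk / average_risk
print(round(lift, 1))  # 2.7
```

The ratio of the two reported figures suggests this borrower is roughly 2.7 times as likely to prepay as the average mortgage holder.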
Business rules are found along every path. We can derive
a rule that applies to the example borrower and other borrowers
who have similar characteristics. That rule is: if the
mortgage is greater than or equal to $67,751 and less than
$182,926, and the interest rate is 8.69% or higher, and the
loan-to-value ratio is less than 87.4%, then the probability of
prepayment is 25.6%.
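Written as code, such a rule becomes a single test that a computer can apply to every loan. The sketch below is illustrative only: the function name and argument names are hypothetical, and the 8.69% boundary is treated as inclusive on the assumption that "not under 8.69%" in the walk-through means 8.69% or higher.

```python
def rule_applies(mortgage, interest_rate, loan_to_value):
    """True when a loan falls in the 25.6% segment described above.

    mortgage is in dollars; interest_rate and loan_to_value are in percent.
    """
    return (
        67_751 <= mortgage < 182_926
        and interest_rate >= 8.69  # answered "no" to "under 8.69%?"
        and loan_to_value < 87.4
    )


PREPAYMENT_PROBABILITY = 0.256  # value reported for this segment

# The example borrower: $174,000 mortgage, 8.92% rate, 43.5% loan-to-value.
example = rule_applies(mortgage=174_000, interest_rate=8.92, loan_to_value=43.5)
print(example)  # True
```

Every other path through the tree yields a rule of the same shape, each with its own thresholds and its own probability.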
The foregoing illustrates a new use for the decision tree:
encoding work rules, which are applied automatically by the
computer, since we can form a work rule from every path of
the tree. The Chase tree was tested successfully for the
effectiveness of the developed prediction model,
as follows. In a sample of 22,000 cases employed in this exercise,
25% were held aside to test the model and 75% were
used to train the machine and develop the predictive model.
Then the model produced was tested and found to be doing
well on the test set. Therefore, it was reasonable to expect
that the model would
do well in general.
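The 75/25 division of the cases can be sketched as a simple holdout split. The text does not say how the cases were actually selected, so the random shuffle, the function name `holdout_split` and the seed are assumptions made for illustration.

```python
import random


def holdout_split(cases, test_fraction=0.25, seed=0):
    """Shuffle the cases and return (training_set, test_set)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


case_ids = range(22_000)  # stand-ins for the 22,000 mortgage cases
train_set, test_set = holdout_split(case_ids)
print(len(train_set), len(test_set))  # 16500 5500
```

Because the model never sees the held-aside cases during training, its performance on them is a fair estimate of how it will do on new mortgages.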
Many trees were developed to cover different
categories of mortgages:
fixed-rate, variable-rate, etc.
These decision trees were
integrated into the bank’s
systems. When this project
launched, the bank gained
an additional $600 million
in profit. The above prediction
model identified 74% of
mortgage prepayments before
they took place and guided
the management of mortgage
portfolios to make the
correct decision in real time,
either through refinancing
or selling the mortgage. So,
Chase Bank not only successfully
decreased this risk
but also changed it into an
opportunity to make money.
Management tools and
analytics have been in use
for a long time, but they have
come to light again in
this era of technology revolution and become differentiation
tools for businesses.
In addition to the classical uses of the decision tree (supporting
decision-making processes and finding a solution to a
repeated problem), the new uses include training computers
from data and developing a predictive model that predicts
individuals’ future behavior, as well as encoding work rules that
are applied automatically by the machine, since every path of the
developed tree is a work rule within the context of the issue
analyzed.
This use of management tools and analytics has become a
competitive advantage in the era of the technology revolu-
tion in which we live.
Alaa Kafafi is a quality management and operational effectiveness
consultant with a bachelor’s degree in mechanical engineering and a
master’s in fluid mechanics. He has led quality management efforts
with local and international oil and gas companies in the Middle East
(among them British Petroleum) and engineering companies in Calgary,
Canada. He was director of quality & HSE with Toyo Engineering.
A Six Sigma black belt, Kafafi practiced project management
and engineering of pipelines, as well as contributed to training on lean
Six Sigma.
FIGURE 5
Filling out the tree’s branches
The completed decision tree for Chase mortgage data identifies 10 distinct segments with risk levels
ranging from 2.6% to 40%. (Follow the arrows to the left for “yes” and right for “no.”)