MGMT 665: Business Analytics
Fall Semester 2022
· The purpose of this project is to apply business analytics techniques (descriptive, predictive, prescriptive) on data in spreadsheet, and demonstrate results that add business value.
· You can do the project as a group of 3-4 people, or you can do it individually.
· Select a Dataset (from your work), or from the
internet (open dataset) that you would like to perform the analytics on it.
When selecting the data, try to think about the business values that can be
generated by applying analytics to this data. Remember, the business
value-driven is the best orientation for applying analytics. If you find the
business value, then you can confidently select this data.
· he dataset should be in stored in Excel file,
and include at least 400 records (rows)
· You can use google dataset search (link below) to help you searching for dataset on the internet https://datasetsearch.research.google.com/
Elements of the project report
Section 1: Introduction (1 page)
·The introduction section should cover the following points:
oWhat is the topic of the data (for example, Sale, Energy, World cities, etc.)
o What is the domain of the data (for example, Finance, HR, Marketing, etc.)
oWhat is the business value that you would like to generate.
oWhat kind of business decisions that will be supported
Section 2: Information
management (0.5 page) (optional)
· If some parts of your dataset requires to do
some transformation (for example, standardization of some data like currency, or
date), explain what transformation that you did
Section 3: Descriptive
analytics technique (2 – 4 pages)
·This section should cover the following points:
oWhat type of the descriptive analytics techniques that will help you to understand your data. You can include more than one technique if needed
oFor each descriptive analytics technique:
§ Explain the reason why this technique is used
§ Present the result (graph or table) that is generated from this technique
Briefly explain the results
Section 4: Predictive analytics technique (2 – 4 pages)
·If the predictive technique is required in your case to generate the business value, then this section should cover the following points:
oWhat type of the predictive analytics techniques that will help you to get insight from your data. You can include more than one technique if needed.
oFor each predictive analytics technique (that are required for your case):
§ Explain the reason why this technique is used
§ Present the result (graph or table) that is generated from this technique
§ Briefly explain the results
· If the predicative technique is not required in your case, identify some scenarios (1 to 2) where you can apply predictive technique in your data. For each of the identified scenario:
§ Explain the scenario
§ Present the result (graph or table) that is generated from this technique
§ Briefly explain the results
Section 4: Prescriptive analytics technique (2 – 4 pages)
· If the prescriptive analytics technique is required for your case, then This section should cover the following points:
What type of the prescriptive analytics techniques that can precisely support the business decision (that you identified in Section 1 – in the last point). Only one technique will be enough. Once selected a prescriptive technique:
§ Explain the reason why this technique is used
§ Present the results that is generated from this technique
§ Briefly explain the results
· If the prescriptive technique is not required in your case, identify one scenario where you can apply prescriptive in your data (optional). If you can identify a scenario, then:
§ Explain the scenario
§ Present the results that is generated from this technique
§ Briefly explain the results
Section
1: Introduction
This paper deliberates data regarding a bank's marketing campaign. The dataset obtained for this study belongs to a continuum of domains including finance, sales, and marketing. It will help banks develop better marketing strategies that will generate high revenue. The underlying rationale for the data is to increase the effectiveness of the marketing campaigns. Bank customers have varying needs and thus require different services from banks. Misguided marketing can cause a bank to spend a significant amount of resources in terms of money and time and still not attain the intended objectives of increased revenue. Hence, the marketing needs to be targeted to potential clients to be effective.Analysis of this data will generate insight that will guide future marketing campaigns. First, it will help create clusters of clients with similar characteristics, such as job description and education level. Banks can target these clusters using different techniques and amenities. By doing so, banks will not waste valuable resources targeting a group with irrelevant plans and services. For instance, blue-collar individuals save in a different format than white-collar individuals since they receive their money regularly. They do not have to wait for the end month to receive their salaries. Thus, their savings culture and needs are different. Banks need to tailor their marketing for this group to augur well with their earnings. Besides, the data will provide meaningful information on the relationship between marital status, loans, and bank balance.Besides, the data will generate strategic value for banks as they will manage their resources effectively to attain the optimum output. This value will enable banks to gain a competitive edge in the industry. The dataset used is obtained from Kaggle. It has been downloaded over 30,000 times and viewed over 250,000 times.
Section 2: Information Management
The initial data set
comprised more than 11,000 records which was too much. As a result, some
transformation was made to the data to attain a reasonable number of entries.Data filtering was performed on pertinent columns including rating, reviews,
and installs. Conditional formatting was applied on all the columns based on a
predetermined measure that augured well with the aim of the research.
Extraction through advanced filtering was conducted on several columns that had
categorical data. The overarching lucidity for this action is that Excel does
not allow a descriptive analysis of non-numeric data. Missing values were
identified and populated with data before the analysis tests were conducted(Chatfield
& Haipeng Xing, 2019). The filtering was intended to bring the records to a
reasonable number above the required threshold of 400 records. The final pivot
table comprised 564 records and was attained by filtering the balance category.
Section 3: Descriptive Data Analytics
A frequency distribution was run on Excel. This test returns a tabular or graphical output that displays the precise number of observations on the categories along a measurement scale. This descriptive technique is important as it allows the researcher to glance at the entire data conveniently. Besides, it displays whether the observations are high or low and whether they are concerted or spread across the measurement scale(Hogg et al., 2019). The frequency distribution was run on relevant categories in the data set. Other descriptive statistics such as mean, median, mode, standard deviation, and variance were run on numerical categories.
Frequency Distribution
Job |
|||||
|
Frequency |
Percent |
Valid Percent |
Cumulative
Percent |
|
Valid |
admin. |
44 |
7.8 |
7.8 |
7.8 |
blue-collar |
64 |
11.3 |
11.3 |
19.1 |
|
entrepreneur |
16 |
2.8 |
2.8 |
22.0 |
|
housemaid |
11 |
2.0 |
2.0 |
23.9 |
|
management |
178 |
31.6 |
31.6 |
55.5 |
|
retired |
59 |
10.5 |
10.5 |
66.0 |
|
self-employ |
22 |
3.9 |
3.9 |
69.9 |
|
services |
26 |
4.6 |
4.6 |
74.5 |
|
student |
16 |
2.8 |
2.8 |
77.3 |
|
technician |
106 |
18.8 |
18.8 |
96.1 |
|
unemployed |
17 |
3.0 |
3.0 |
99.1 |
|
unknown |
5 |
.9 |
.9 |
100.0 |
|
Total |
564 |
100.0 |
100.0 |
|
Table 1: Job
Most of the customers targeted in the latest marketing
campaign by the bank were employees in management positions. This was a
deliberate effort because they have sufficient disposable income and are paid
via their accounts at the bank. Technicians were the second biggest group
targeted in the campaign. The next in line were employees in blue-collar and
admin positions. The latter receive payments regularly and do not have to wait
till the end of the month to receive their salaries(Johnson & Wichern,
2019). Hence, their financial needs are differentiated from salaried employees receiving
monthly payments.
Marital Status
Figure 1: Marital Status
Most of the clients
were married, implying that they had additional bills to support their families.
Hence, they are more likely to take loans, which are banks' primary income sources.
Single individuals were the second, followed by divorced persons.
Education |
|||||
|
Frequency |
Percent |
Valid Percent |
Cumulative
Percent |
|
Valid |
primary |
79 |
14.0 |
14.0 |
14.0 |
secondary |
198 |
35.1 |
35.1 |
49.1 |
|
tertiary |
258 |
45.7 |
45.7 |
94.9 |
|
unknown |
29 |
5.1 |
5.1 |
100.0 |
|
Total |
564 |
100.0 |
100.0 |
|
Table 2: Education
Most of the clients had attained tertiary education,
which implies that they were knowledgeable of the terms of the loans marketed
to them by the bank. This population could make an informed choice regarding
the products marketed to them. These clients had better jobs, implying better
salaries that could translate to higher loans. Their savings were also
considerably higher compared to individuals that only attained secondary
education.
Loan |
|||||
|
Frequency |
Percent |
Valid Percent |
Cumulative
Percent |
|
Valid |
no |
534 |
94.7 |
94.7 |
94.7 |
yes |
30 |
5.3 |
5.3 |
100.0 |
|
Total |
564 |
100.0 |
100.0 |
|
Table 3: Loans
Most of the clients did not have an active loan. This implies that they could take new loans against their assets or salaries. Current loans often disqualify clients from taking additional loans until they are cleared. However, many clients with loans are likely to take more loans before or after repaying them. Hence, they form a good basis for a marketing campaign. Clients that are yet to take loans can be enticed by various strategies that will be discussed in the predictive analytics section.
Figure 2: Outcome from previous campaigns
The outcome of the previous campaigns was largely
unknown. This implies that the bank did not set smart metrics to indicate
whether the marketing campaign was successful. In the known cases, failure and
success were approximately the same, meaning the marketing campaign was ineffective(Hays,
2018). Advancing technology has now made it possible to track the effectiveness
of marketing campaigns. Banks can leverage this aspect to determine if they
need to restructure their campaigns to attain augmented outcomes.
Contact |
|||||
|
Frequency |
Percent |
Valid Percent |
Cumulative
Percent |
|
Valid |
cellular |
417 |
73.9 |
73.9 |
73.9 |
telephone |
57 |
10.1 |
10.1 |
84.0 |
|
unknown |
90 |
16.0 |
16.0 |
100.0 |
|
Total |
564 |
100.0 |
100.0 |
|
Table 4: Contacts
The majority of the clients could be reached via
cellular. This population could thus be reached through email and internet
marketing via their smartphones. Clients reachable via telephone could be
reached through telephone marketing, whereas a combination of different
marketing techniques could reach the unknown. Banks must use the preferred
communication channels for the clients as this will increase the likelihood of
engagement. Even though the clients may have several communication platforms,
banks can identify the most frequently used. Subsequently, they can use this
channel to carry out the campaigns.
Month |
|||||
|
Frequency |
Percent |
Valid Percent |
Cumulative
Percent |
|
Valid |
Apr |
42 |
7.4 |
7.4 |
7.4 |
Aug |
80 |
14.2 |
14.2 |
21.6 |
|
Dec |
8 |
1.4 |
1.4 |
23.0 |
|
Feb |
37 |
6.6 |
6.6 |
29.6 |
|
Jan |
10 |
1.8 |
1.8 |
31.4 |
|
Jul |
49 |
8.7 |
8.7 |
40.1 |
|
Jun |
61 |
10.8 |
10.8 |
50.9 |
|
Mar |
25 |
4.4 |
4.4 |
55.3 |
|
May |
107 |
19.0 |
19.0 |
74.3 |
|
Nov |
90 |
16.0 |
16.0 |
90.2 |
|
Oct |
31 |
5.5 |
5.5 |
95.7 |
|
Sep |
24 |
4.3 |
4.3 |
100.0 |
|
Total |
564 |
100.0 |
100.0 |
|
Table 5: Month
The above table displays the number of clients who
responded to the marketing campaign each month. Most of them responded in May,
November, and August. May is just after the Easter celebrations, and August is
before the start of a new semester. During both periods, the clients needed
money to pay various bills. Particularly, they may need this money to pay for
bills such as school fees or vacations.
Descriptive Statistics
Table
6: Age
The mean age of the clients was 44.4, and the median was 42 years. This age represents a group of individuals that have worked for over fifteen years. They have likely made investments or are about to make investments. Thus, they need bank loans to fund these investments. Besides, they have acquired significant savings that they can combine with their loans to fund their desired projects.
balance |
|
Mean |
11516.8 |
Standard
Error |
349.1243 |
Median |
8851.5 |
Mode |
8304 |
Standard
Deviation |
8291.243 |
Sample
Variance |
68744716 |
Kurtosis |
25.06537 |
Skewness |
4.223749 |
Range |
75200 |
Minimum |
6004 |
Maximum |
81204 |
Sum |
6495473 |
Count |
564 |
Table 7: Balance
The average
balance of the clients was $11,516.8, whereas the mode and median were 8304 and
8851.5, respectively. The range of the balance was large, which signifies that
the clients have varying financial abilities and thus require customized
services(Johnson & Wichern, 2019). Correspondingly, the amount of loans
they can take is different. Customers with higher balances can take bigger
loans due to the transactions that offer them a higher threshold.
campaign |
|
Mean |
2.547872 |
Standard
Error |
0.146336 |
Median |
2 |
Mode |
1 |
Standard
Deviation |
3.475289 |
Sample
Variance |
12.07763 |
Kurtosis |
165.2361 |
Skewness |
10.39447 |
Range |
62 |
Minimum |
1 |
Maximum |
63 |
Sum |
1437 |
Count |
564 |
Table 8: Campaign
The average time for the previous campaign was 2.5 days, while the median and mode were 2 and 1, respectively. Likewise, the range was large, indicating that the marketing campaigns varied significantly. Some were simple to design and implement and took fewer days, whereas others were intricate, taking more days. For instance, telephone marketing can take place in a day because it is simple to call clients that meet a predetermined criterion. On the other hand, internet marketing may take longer since the clients have to be divided into clusters and different avenues used. Campaigns that last for extended periods are likely to reach a bigger audience.
Section 4: Predictive Business Analytics
The forecasting of the future value of balance is a pertinent aspect of this analytical exercise. The purpose is to utilize the existent historical data in the prediction of future value for appropriate planning. Forecasting provides an estimate to any organization to be able to develop a reliable budget and strategy to mitigate the expectations given the information obtained from history (Johnson & Wichern, 2019). Therefore, this section will circumspectly outline the forecasting of balance using two methods namely exponential smoothing and moving average.
Moving
Average
The
simple moving average avails the forecast value based on the formula below: -
Whereby t is the current period, is the forecast for the next
period, and n is the forecasting window also known as in n -point moving
average. Therefore, if n is equal to 2, the moving average aggregates two observations
back and finds the average within that window (Roderick & Rubin, 2019). A
is the actual figures given for each period.This report employs a three-point moving average to obtain
the forecast of the Sales figures. The equation is as above with n being equal
to 3 for every window throughout the dataset.
The plot of the moving average is outputted below.
|
Forecast |
Standard Error |
|
6263 |
2557.918 |
|
6343.5 |
18.12406 |
Exponential
Smoothing
Exponential smoothing is also a forecasting technique that
is slightly different from the moving average function. The equation of the
exponential smoothing looks like the one below: -
The moving average expresses equal bias on each observation on a particular window. The exponential smoothing function contains a smoothing constant which expresses how much the forecast will react to the observed differences(Chatfield & Haipeng Xing, 2019). The higher the value of the smoothing constant the higher the reaction to differences. Choosing a smoothing constant of 0.2 yields the forecast below.
The plot is outputted below:
-
|
Forecast |
Standard error |
|
6481.343 |
5990.94 |
|
6289.869 |
1524.62 |
Regular exponential smoothing will always lag behind the trend and sometimes it ought to be incorporated into the forecasting using the technique. Therefore, the forecast for July and August based on the data is given by the model, and the values are shown above. This state, therefore, indicates that the prediction is based on, historical data
Conclusion
The paper therein has entailed the analysis of the marketing campaign of a bank. The variables associated therein inclusive of the outcome of the campaign, the bank balance of the customers as well as the time have been instrumental. Moreover, the marital status, and the time required for previous campaigns. The average age of clients is also pertinent for the understanding of the impact of the marketing campaigns. The descriptive statistics associated with the variables have been comprehensively exposed to facilitate the potent understanding of the contents of the dataset about marketing in banks.
The predictive analytics furthermore has built on the descriptive knowledge assembled from the descriptive analytics to understand any projections. The variable under analysis was the client’s bank balance regarding the marketing campaign’s impact. Two-time series methods have been employed for forecasting. The moving average model considered the average of the bank balance over the three-year window and the projection stood at 6293 with a standard deviation of 2557.92. the range of this projection is also subject to the addition or deduction of the standard deviation from the projection. The exponential smoothing forecast stands at 6481.34 with a standard error of 5990.94. this latter forecast is a wide range of values as the minimum can go to below a thousand dollars in bank balance while the maximum can go to more than eleven thousand dollars. These forecasts indicate the wide range of values for possible bank balance for any customer. Therefore, the descriptive and predictive analytics of this dataset has exposed a variety of information about the balance of customer deposits in the bank about the marketing campaign.
References
Chatfield, C., & Haipeng Xing. (2019).
The analysis of time series : an introduction with R. Crc Press,Taylor
& Francis Group.
Hays, W. L. (2018). Statistics.
Wadsworth/Thomson Learning.
Hogg, R. V., Tanis, E. A., &
Zimmerman, D. L. (2019). Probability and statistical inference. Pearson.
Johnson, R. A., & Wichern, D. W.
(2019). Applied multivariate statistical analysis. Pearson.
Roderick, & Rubin, D. B. (2019). Statistical
Analysis with Missing Data. John Wiley & Sons.