{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem Statement\n", "\n", "Using the data collected from existing customers, build a model that will help the marketing team identify potential customers who are relatively more likely to subscribe to a term deposit and thus increase their hit ratio." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Solution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### General and domain knowledge assumptions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This problem statement relates to the banking and financial sector. For the time being, let us set the data set aside (though we know its source and content) and make a few assumptions.\n", "\n", "What age range can we expect in the data set? We do not know which region of the world this data comes from. But since this is the banking and financial domain, the customers would be verified and authenticated users, mostly with a minimum age of 18 or 21 and an upper limit of around 100.\n", "\n", "From past experience of working with banking data sets, we know that experience, salary, loans, and credit card expenditure are inputs we can expect to encounter in a new data set, and they can weigh heavily on the output variable we need to predict.\n", "\n", "We also need to consider the profession of each individual in the input data. A person with a high income usually invests in more than one financial domain but still has a good chance of being among the people applying for a deposit.\n", "\n", "People in the low and middle income ranges are very particular about investment and tend to trust banks more than other places. However, just as we encounter outliers in any data set, some people in this group would still invest in places other than banks. 
These are usually the risk takers.\n", "\n", "Our final outcome is a prediction of whether an individual would be interested in a term deposit or not, so why are we talking so much about investment? Well, there is an inverse relation between other investments and term deposits. This may sound like a contradiction, since a deposit is also an investment, but if an individual is investing more in other investment plans, then naturally their investment in term deposits would be fairly low." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Existing Algorithms and approaches" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since this is a binary prediction problem based on a number of inputs, we already have a few approaches in mind, such as the Naive Bayes (NB) classifier and kNN. Logistic regression also seems a good fit. With 17+ dimensions to consider, we can even consider a random forest, which trains each tree on a random subset of the features." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### General Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAWAAAABICAYAAADI6S+jAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAACEklEQVR4nO3aMWpUYRSG4XMdISkUhSQgKChCOjunt7SycQc2swo7FyBYCLoBF6M2Iti4AhPsRBDk2NgY1DAwv9/M9Xm6udziO81bXGbq7gLg37uQHgDwvxJggBABBggRYIAQAQYIEWCAkIvnvTBN06qqVlVVe4v9u9eu3Bg+KuXk0n56wlCHiy/pCcNc/XySnjDUx95LTxiqry/SE4b6+uHTaXcfnX0+rfM/4FsHx/34/rONDtsmL+4dpycM9ejy6/SEYR6+ep6eMNSD77fTE4b69uQgPWGod8unb7t7efa5TxAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAiwAAhAgwQIsAAIQIMECLAACECDBAydfffX5imVVWtfv68U1XvR48KOqyq0/SIQeZ8W5X7dt3c77vZ3UdnH54b4F9enqY33b3c6KwtMuf75nxblft23dzv+xOfIABCBBggZN0AvxyyYnvM+b4531blvl039/t+a61vwABsjk8QACECDBAiwAAhAgwQIsAAIT8AsnxTG0hUjfsAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "#Import Necessary Libraries\n", "\n", "# NumPy: For mathematical funcations, array, matrices operations\n", "import numpy as np \n", "\n", "# Graph: Plotting graphs and other visula tools\n", "import pandas as pd\n", "import seaborn as sns\n", "\n", "# sns.set_palette(\"muted\")\n", "# sns.set(color_codes=True)\n", "# sns.color_palette(\"colorblind\", 10)\n", "\n", "\n", "# color_palette = sns.color_palette()\n", "# To enable inline plotting graphs\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "\n", "# palette = sns.color_palette(\"muted\")\n", "\n", "# sns.set_palette(palette)\n", "\n", "# sns.palplot(palette)\n", "\n", "flatui = [\"#9b59b6\", \"#3498db\", \"#95a5a6\", \"#e74c3c\", \"#34495e\", \"#2ecc71\"]\n", "\n", "sns.set_palette(flatui)\n", "\n", "sns.palplot(sns.color_palette())" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Custom terminal printer\n", "\n", "#Lets try to print unique values from object data type\n", "from IPython.display import Markdown, display\n", "\n", "def printTextAsMarkdown(title, content, color=None):\n", " if title is None:\n", " colorStr = \"{}\".format(color, content)\n", " else: \n", " colorStr = \"**{}** : {}\".format(color, title, content)\n", " \n", " display(Markdown(colorStr))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total Colums in dataframe: 17\n", "Columns list ['age', 'job', 'marital', 'education', 'default', 'balance', 'housing', 'loan', 'contact', 'day', 'month', 'duration', 'campaign', 'pdays', 'previous', 'poutcome', 'Target']\n", "***********************************************************************************************************************\n", "Columns Map {0: 'age', 1: 'job', 2: 'marital', 3: 'education', 4: 'default', 5: 'balance', 6: 
'housing', 7: 'loan', 8: 'contact', 9: 'day', 10: 'month', 11: 'duration', 12: 'campaign', 13: 'pdays', 14: 'previous', 15: 'poutcome', 16: 'Target'}\n" ] } ], "source": [ "# Load the data set\n", "# Read the CSV file into a pandas DataFrame\n", "df_original = pd.read_csv('bank-full.csv')\n", "\n", "# Print the total number of columns\n", "print(\"Total Colums in dataframe: \", len(df_original.columns))\n", "\n", "# Collect the column names as a plain Python list\n", "df_original_columns = list(df_original.columns)\n", "\n", "print(\"Columns list {}\".format(df_original_columns))\n", "print(\"***********************************************************************************************************************\")\n", "\n", "# Map positional index -> column name for quick access\n", "df_original_columns_map = dict(enumerate(df_original_columns))\n", "\n", "print(\"Columns Map {}\".format(df_original_columns_map))\n", "\n", "# The column names and their index mapping are now kept separately from the data, so at any point during\n", "# analysis or cleaning we can look up a column by either its index or its name\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>age</th><th>job</th><th>marital</th><th>education</th><th>default</th><th>balance</th><th>housing</th><th>loan</th><th>contact</th><th>day</th><th>month</th><th>duration</th><th>campaign</th><th>pdays</th><th>previous</th><th>poutcome</th><th>Target</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>0</th><td>58</td><td>management</td><td>married</td><td>tertiary</td><td>no</td><td>2143</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>261</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>1</th><td>44</td><td>technician</td><td>single</td><td>secondary</td><td>no</td><td>29</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>151</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>2</th><td>33</td><td>entrepreneur</td><td>married</td><td>secondary</td><td>no</td><td>2</td><td>yes</td><td>yes</td><td>unknown</td><td>5</td><td>may</td><td>76</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>3</th><td>47</td><td>blue-collar</td><td>married</td><td>unknown</td><td>no</td><td>1506</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>92</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>4</th><td>33</td><td>unknown</td><td>single</td><td>unknown</td><td>no</td><td>1</td><td>no</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>198</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>5</th><td>35</td><td>management</td><td>married</td><td>tertiary</td><td>no</td><td>231</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>139</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>6</th><td>28</td><td>management</td><td>single</td><td>tertiary</td><td>no</td><td>447</td><td>yes</td><td>yes</td><td>unknown</td><td>5</td><td>may</td><td>217</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>7</th><td>42</td><td>entrepreneur</td><td>divorced</td><td>tertiary</td><td>yes</td><td>2</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>380</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>8</th><td>58</td><td>retired</td><td>married</td><td>primary</td><td>no</td><td>121</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>50</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>9</th><td>43</td><td>technician</td><td>single</td><td>secondary</td><td>no</td><td>593</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>55</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>10</th><td>41</td><td>admin.</td><td>divorced</td><td>secondary</td><td>no</td><td>270</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>222</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>11</th><td>29</td><td>admin.</td><td>single</td><td>secondary</td><td>no</td><td>390</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>137</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>12</th><td>53</td><td>technician</td><td>married</td><td>secondary</td><td>no</td><td>6</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>517</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>13</th><td>58</td><td>technician</td><td>married</td><td>unknown</td><td>no</td><td>71</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>71</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>14</th><td>57</td><td>services</td><td>married</td><td>secondary</td><td>no</td><td>162</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>174</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "    <tr><th>15</th><td>51</td><td>retired</td><td>married</td><td>primary</td><td>no</td><td>229</td><td>yes</td><td>no</td><td>unknown</td><td>5</td><td>may</td><td>353</td><td>1</td><td>-1</td><td>0</td><td>unknown</td><td>no</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " age job marital education default balance housing loan \\\n", "0 58 management married tertiary no 2143 yes no \n", "1 44 technician single secondary no 29 yes no \n", "2 33 entrepreneur married secondary no 2 yes yes \n", "3 47 blue-collar married unknown no 1506 yes no \n", "4 33 unknown single unknown no 1 no no \n", "5 35 management married tertiary no 231 yes no \n", "6 28 management single tertiary no 447 yes yes \n", "7 42 entrepreneur divorced tertiary yes 2 yes no \n", "8 58 retired married primary no 121 yes no \n", "9 43 technician single secondary no 593 yes no \n", "10 41 admin. divorced secondary no 270 yes no \n", "11 29 admin. single secondary no 390 yes no \n", "12 53 technician married secondary no 6 yes no \n", "13 58 technician married unknown no 71 yes no \n", "14 57 services married secondary no 162 yes no \n", "15 51 retired married primary no 229 yes no \n", "\n", " contact day month duration campaign pdays previous poutcome Target \n", "0 unknown 5 may 261 1 -1 0 unknown no \n", "1 unknown 5 may 151 1 -1 0 unknown no \n", "2 unknown 5 may 76 1 -1 0 unknown no \n", "3 unknown 5 may 92 1 -1 0 unknown no \n", "4 unknown 5 may 198 1 -1 0 unknown no \n", "5 unknown 5 may 139 1 -1 0 unknown no \n", "6 unknown 5 may 217 1 -1 0 unknown no \n", "7 unknown 5 may 380 1 -1 0 unknown no \n", "8 unknown 5 may 50 1 -1 0 unknown no \n", "9 unknown 5 may 55 1 -1 0 unknown no \n", "10 unknown 5 may 222 1 -1 0 unknown no \n", "11 unknown 5 may 137 1 -1 0 unknown no \n", "12 unknown 5 may 517 1 -1 0 unknown no \n", "13 unknown 5 may 71 1 -1 0 unknown no \n", "14 unknown 5 may 174 1 -1 0 unknown no \n", "15 unknown 5 may 353 1 -1 0 unknown no " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Data frame general analysis\n", "df_original.head(16)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", 
"RangeIndex: 45211 entries, 0 to 45210\n", "Data columns (total 17 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 age 45211 non-null int64 \n", " 1 job 45211 non-null object\n", " 2 marital 45211 non-null object\n", " 3 education 45211 non-null object\n", " 4 default 45211 non-null object\n", " 5 balance 45211 non-null int64 \n", " 6 housing 45211 non-null object\n", " 7 loan 45211 non-null object\n", " 8 contact 45211 non-null object\n", " 9 day 45211 non-null int64 \n", " 10 month 45211 non-null object\n", " 11 duration 45211 non-null int64 \n", " 12 campaign 45211 non-null int64 \n", " 13 pdays 45211 non-null int64 \n", " 14 previous 45211 non-null int64 \n", " 15 poutcome 45211 non-null object\n", " 16 Target 45211 non-null object\n", "dtypes: int64(7), object(10)\n", "memory usage: 5.9+ MB\n" ] } ], "source": [ "# Dataframe information\n", "# Lets analyse data based on following conditions\n", "# 1. Check whether all rows x colums are loaded as given in question, all data must match before we start to even operate on it.\n", "# 2. Print shape of the data\n", "# 8. Check data types of each field\n", "# 3. Find presence of null or missing values.\n", "# 4. Visually inspect data and check presense of Outliers if there are any and see are \n", "# they enough to drop or need to consider during model building\n", "# 5. Print shape of the data\n", "# 6. Do we need to consider all data columns given in data set for model building\n", "# 7. Find Corr, median, mean, std deviation, min, max for columns.\n", "\n", "df_original.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We cannot use this raw data set as it is, as it container flelds which are of type object.\n", "This data is usually in the form of string and we should be able to get categories out of this obect type. 
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "age int64\n", "job object\n", "marital object\n", "education object\n", "default object\n", "balance int64\n", "housing object\n", "loan object\n", "contact object\n", "day int64\n", "month object\n", "duration int64\n", "campaign int64\n", "pdays int64\n", "previous int64\n", "poutcome object\n", "Target object\n", "dtype: object" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# data types\n", "\n", "df_original.dtypes\n", "\n", "# also part of info indicating which are int nd object type though redundant." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check presence of any null values\n", "\n", "df_original.isnull().values.any()\n", "\n", "# This return `False` it mean we do no have any present of null values" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check presence of missing value\n", " \n", "df_original.isna().values.any()\n", "\n", "# This return `False` it mean we do no have any present of missing values" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(45211, 17)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Shape of the data\n", "\n", "df_original.shape\n", "\n", "# we have 45211 rows and 17 columns" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>age</th><th>balance</th><th>day</th><th>duration</th><th>campaign</th><th>pdays</th><th>previous</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>count</th><td>45211.000000</td><td>45211.000000</td><td>45211.000000</td><td>45211.000000</td><td>45211.000000</td><td>45211.000000</td><td>45211.000000</td></tr>\n", "    <tr><th>mean</th><td>40.936210</td><td>1362.272058</td><td>15.806419</td><td>258.163080</td><td>2.763841</td><td>40.197828</td><td>0.580323</td></tr>\n", "    <tr><th>std</th><td>10.618762</td><td>3044.765829</td><td>8.322476</td><td>257.527812</td><td>3.098021</td><td>100.128746</td><td>2.303441</td></tr>\n", "    <tr><th>min</th><td>18.000000</td><td>-8019.000000</td><td>1.000000</td><td>0.000000</td><td>1.000000</td><td>-1.000000</td><td>0.000000</td></tr>\n", "    <tr><th>25%</th><td>33.000000</td><td>72.000000</td><td>8.000000</td><td>103.000000</td><td>1.000000</td><td>-1.000000</td><td>0.000000</td></tr>\n", "    <tr><th>50%</th><td>39.000000</td><td>448.000000</td><td>16.000000</td><td>180.000000</td><td>2.000000</td><td>-1.000000</td><td>0.000000</td></tr>\n", "    <tr><th>75%</th><td>48.000000</td><td>1428.000000</td><td>21.000000</td><td>319.000000</td><td>3.000000</td><td>-1.000000</td><td>0.000000</td></tr>\n", "    <tr><th>max</th><td>95.000000</td><td>102127.000000</td><td>31.000000</td><td>4918.000000</td><td>63.000000</td><td>871.000000</td><td>275.000000</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " age balance day duration campaign \\\n", "count 45211.000000 45211.000000 45211.000000 45211.000000 45211.000000 \n", "mean 40.936210 1362.272058 15.806419 258.163080 2.763841 \n", "std 10.618762 3044.765829 8.322476 257.527812 3.098021 \n", "min 18.000000 -8019.000000 1.000000 0.000000 1.000000 \n", "25% 33.000000 72.000000 8.000000 103.000000 1.000000 \n", "50% 39.000000 448.000000 16.000000 180.000000 2.000000 \n", "75% 48.000000 1428.000000 21.000000 319.000000 3.000000 \n", "max 95.000000 102127.000000 31.000000 4918.000000 63.000000 \n", "\n", " pdays previous \n", "count 45211.000000 45211.000000 \n", "mean 40.197828 0.580323 \n", "std 100.128746 2.303441 \n", "min -1.000000 0.000000 \n", "25% -1.000000 0.000000 \n", "50% -1.000000 0.000000 \n", "75% -1.000000 0.000000 \n", "max 871.000000 275.000000 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check data loading and analyse data description\n", "\n", "df_original.describe()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "object 10\n", "int64 7\n", "dtype: int64" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Print Different data types from dataframe and its reference type\n", "\n", "df_original.dtypes.value_counts()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we see, only 7 colums have been loaded and rest 10 are missing from here. 
These 10 appear to be categorical (object-type) columns, which `describe()` skips by default, and hence we need to convert them into numerical columns.\n", "\n", "Before we move on to encoding these values, let us examine what they are.\n", "This can be done by checking the unique values and the unique count of each column.\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**age** : has unique data in this range [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 92, 93, 94, 95]" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**balance** : has unique data in this range [-8019, -6847, -4057, -3372, -3313, -3058, -2827, -2712, -2604, -2282, -2122, -2093, -2082, -2049, -1980, -1968, -1965, -1944, -1941, -1884, -1882, -1854, -1818, -1781, -1779, -1746, -1737, -1730, -1725, -1701, -1680, -1668, -1664, -1661, -1655, -1636, -1629, -1621, -1613, -1601, -1598, -1586, -1547, -1545, -1531, -1500, -1493, -1490, -1489, -1485, -1480, -1459, -1455, -1451, -1445, -1415, -1414, -1400, -1386, -1385, -1379, -1361, -1350, -1336, -1329, -1322, -1317, -1313, -1312, -1310, -1300, -1272, -1270, -1249, -1246, -1232, -1224, -1217, -1212, -1206, -1202, -1196, -1193, -1185, -1176, -1168, -1164, -1161, -1157, -1148, -1139, -1137, -1136, -1129, -1124, -1122, -1112, -1110, -1105, -1099, -1092, -1091, -1089, -1085, -1083, -1080, -1076, -1053, -1050, -1049, -1042, -1041, -1040, -1038, -1036, -1034, -1027, -1026, -1019, -1014, -1013, -1011, -1007, -1006, -1002, -1001, -999, -998, -997, -995, -994, -988, -985, -983, -982, 
-980, -978, -976, -974, -972, -971, -970, -969, -967, -966, -962, -961, -954, -948, -947, -946, -942, -940, -939, -938, -934, -933, -932, -931, -930, -923, -921, -918, -910, -905, -901, -898, -896, -895, -892, -890, -888, -887, -886, -880, -879, -876, -874, -872, -871, -870, -869, -868, -867, -865, -864, -861, -859, -854, -853, -852, -849, -848, -847, -846, -839, -838, -835, -834, -832, -825, -824, -820, -817, -816, -813, -812, -811, -810, -808, -806, -805, -804, -803, -800, -799, -797, -796, -790, -786, -782, -780, -779, -777, -771, -770, -769, -768, -767, -762, -759, -757, -755, -754, -753, -752, -750, -749, -747, -745, -744, -742, -741, -740, -738, -736, -735, -732, -731, -728, -725, -723, -722, -720, -718, -717, -715, -714, -713, -711, -710, -709, -708, -706, -705, -704, -703, -701, -700, -697, -694, -692, -691, -690, -689, -688, -687, -686, -685, -684, -683, -682, -681, -680, -679, -677, -676, -675, -674, -673, -672, -671, -670, -667, -666, -665, -664, -663, -661, -659, -656, -651, -650, -648, -646, -644, -643, -642, -641, -640, -639, -637, -636, -635, -634, -633, -632, -631, -630, -628, -627, -626, -625, -624, -621, -619, -618, -617, -616, -614, -613, -612, -611, -609, -608, -607, -606, -605, -603, -601, -600, -599, -598, -597, -596, -594, -593, -591, -589, -588, -587, -585, -584, -583, -582, -581, -580, -579, -578, -577, -576, -575, -574, -572, -571, -570, -569, -568, -566, -565, -564, -563, -562, -560, -559, -558, -557, -556, -555, -554, -553, -552, -551, -550, -549, -548, -547, -546, -545, -544, -542, -541, -540, -538, -537, -535, -534, -533, -532, -531, -530, -529, -528, -527, -526, -525, -524, -523, -522, -521, -519, -518, -517, -516, -515, -513, -512, -511, -510, -509, -508, -507, -506, -505, -504, -503, -502, -501, -500, -499, -498, -497, -496, -495, -494, -493, -492, -491, -490, -489, -488, -487, -485, -483, -482, -481, -480, -479, -478, -477, -476, -475, -474, -473, -472, -471, -470, -469, -468, -467, -466, -465, -464, -463, -462, -461, -460, -459, 
-458, -457, -456, -455, -454, -453, -452, -451, -450, -449, -448, -447, -446, -444, -443, -442, -441, -440, -439, -438, -437, -436, -435, -433, -432, -431, -430, -429, -428, -427, -426, -424, -423, -422, -421, -420, -418, -417, -416, -415, -414, -413, -412, -411, -410, -409, -408, -407, -406, -405, -404, -403, -402, -401, -400, -399, -398, -397, -396, -395, -394, -393, -392, -391, -390, -389, -388, -386, -385, -384, -383, -382, -381, -380, -379, -378, -376, -375, -374, -372, -371, -370, -369, -368, -367, -366, -365, -364, -363, -362, -361, -360, -359, -358, -357, -356, -355, -354, -353, -352, -350, -349, -348, -347, -346, -345, -344, -343, -342, -341, -340, -339, -338, -337, -336, -335, -334, -333, -332, -331, -330, -329, -328, -327, -326, -325, -324, -323, -322, -321, -320, -319, -318, -317, -315, -314, -313, -312, -311, -310, -309, -308, -307, -306, -305, -304, -303, -302, -301, -300, -299, -298, -297, -296, -295, -294, -293, -292, -291, -290, -289, -288, -287, -286, -285, -284, -283, -282, -281, -280, -279, -278, -277, -276, -275, -274, -273, -272, -271, -269, -268, -267, -266, -265, -264, -263, -262, -261, -260, -259, -258, -257, -256, -255, -254, -253, -252, -251, -250, -249, -248, -247, -246, -245, -244, -243, -242, -241, -240, -239, -238, -237, -236, -235, -234, -233, -232, -231, -230, -229, -228, -227, -226, -225, -224, -223, -222, -221, -220, -219, -218, -217, -216, -215, -214, -213, -212, -211, -210, -209, -208, -207, -206, -205, -204, -203, -202, -201, -200, -199, -198, -197, -196, -195, -194, -193, -192, -191, -190, -189, -188, -187, -186, -185, -184, -183, -182, -181, -180, -179, -178, -177, -176, -175, -174, -173, -172, -171, -170, -169, -168, -167, -166, -165, -164, -163, -162, -161, -160, -159, -158, -157, -156, -155, -154, -153, -152, -151, -150, -149, -148, -147, -146, -145, -144, -143, -142, -141, -140, -139, -138, -137, -136, -135, -134, -133, -132, -131, -130, -129, -128, -127, -126, -125, -124, -123, -122, -121, -120, -119, -118, -117, -116, 
-115, -114, -113, -112, -111, -110, -109, -108, -107, -106, -105, -104, -103, -102, -101, -100, -99, -98, -97, -96, -95, -94, -93, -92, -91, -90, -89, -88, -87, -86, -85, -84, -83, -82, -81, -80, -79, -78, -77, -76, -75, -74, -73, -72, -71, -70, -69, -68, -67, -66, -65, -64, -63, -62, -61, -60, -59, -58, -57, -56, -55, -54, -53, -52, -51, -50, -49, -48, -47, -46, -45, -44, -43, -42, -41, -40, -39, -38, -37, -36, -35, -34, -33, -32, -31, -30, -29, -28, -27, -26, -25, -24, -23, -22, -21, -20, -19, -18, -17, -16, -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 
305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 
705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 
1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, 1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1226, 1227, 1228, 1229, 1230, 1231, 1232, 1233, 1234, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258, 1259, 1260, 1261, 1262, 1263, 1264, 1265, 1266, 1267, 1268, 1269, 1270, 1271, 1272, 1273, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, 1303, 1304, 1305, 1306, 1307, 1308, 1309, 1310, 1311, 1312, 1313, 1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321, 1322, 1323, 1324, 1325, 1326, 1327, 1328, 1329, 1330, 1331, 1332, 1333, 1334, 1335, 1336, 1337, 1338, 1339, 1340, 1341, 1342, 1343, 1344, 1345, 1346, 1347, 1348, 1349, 1350, 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1359, 1360, 1361, 1362, 1363, 1364, 1365, 1366, 1367, 1368, 1369, 1370, 1371, 1372, 1373, 1374, 1375, 1376, 1377, 1378, 1379, 1380, 1381, 1382, 1383, 1384, 1385, 1386, 1387, 1388, 1389, 1390, 1391, 1392, 1393, 1394, 1395, 1396, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405, 1406, 1407, 1408, 1409, 1410, 1411, 1412, 1413, 1414, 1415, 1416, 1417, 1418, 1419, 
1420, 1421, 1422, 1423, 1424, 1425, 1426, 1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434, 1435, 1436, 1437, 1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1448, 1449, 1450, 1451, 1452, 1453, 1454, 1455, 1456, 1457, 1458, 1459, 1460, 1461, 1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470, 1471, 1472, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481, 1482, 1483, 1484, 1485, 1486, 1487, 1489, 1490, 1491, 1492, 1493, 1494, 1495, 1496, 1497, 1498, 1499, 1500, 1501, 1502, 1503, 1504, 1506, 1507, 1508, 1509, 1510, 1511, 1512, 1513, 1514, 1515, 1516, 1517, 1518, 1519, 1520, 1521, 1522, 1523, 1524, 1525, 1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536, 1537, 1538, 1539, 1540, 1541, 1542, 1543, 1544, 1545, 1546, 1547, 1548, 1549, 1550, 1551, 1553, 1554, 1555, 1556, 1557, 1558, 1559, 1560, 1561, 1562, 1563, 1564, 1565, 1566, 1567, 1568, 1569, 1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580, 1581, 1582, 1583, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591, 1592, 1593, 1594, 1595, 1596, 1597, 1598, 1599, 1600, 1601, 1602, 1603, 1604, 1605, 1606, 1607, 1608, 1609, 1610, 1611, 1612, 1613, 1614, 1615, 1616, 1617, 1618, 1619, 1620, 1621, 1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629, 1630, 1631, 1632, 1633, 1634, 1635, 1636, 1637, 1638, 1639, 1640, 1641, 1642, 1643, 1644, 1645, 1646, 1647, 1648, 1649, 1650, 1651, 1653, 1654, 1655, 1656, 1657, 1659, 1660, 1661, 1662, 1663, 1664, 1665, 1666, 1667, 1669, 1670, 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, 1681, 1682, 1683, 1684, 1685, 1686, 1687, 1688, 1689, 1690, 1691, 1692, 1693, 1694, 1695, 1696, 1697, 1698, 1699, 1700, 1701, 1702, 1703, 1704, 1705, 1706, 1707, 1708, 1709, 1710, 1711, 1712, 1713, 1714, 1715, 1716, 1717, 1718, 1719, 1720, 1721, 1722, 1723, 1724, 1725, 1726, 1727, 1728, 1729, 1730, 1731, 1732, 1733, 1734, 1735, 1736, 1737, 1738, 1739, 1740, 1741, 1742, 1743, 1744, 1745, 1746, 1747, 1749, 1750, 1751, 1752, 1753, 1755, 1756, 1757, 1758, 1759, 1760, 
1761, 1762, 1763, 1764, 1765, 1766, 1767, 1768, 1769, 1770, 1771, 1772, 1773, 1774, 1775, 1776, 1777, 1778, 1779, 1780, 1781, 1782, 1783, 1784, 1785, 1786, 1787, 1788, 1790, 1791, 1792, 1794, 1795, 1796, 1797, 1798, 1800, 1801, 1802, 1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810, 1811, 1812, 1813, 1814, 1815, 1816, 1817, 1818, 1819, 1820, 1821, 1822, 1823, 1824, 1825, 1826, 1827, 1828, 1830, 1831, 1832, 1833, 1834, 1835, 1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846, 1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857, 1858, 1859, 1860, 1861, 1862, 1863, 1864, 1865, 1866, 1867, 1868, 1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890, 1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2030, 2031, 2032, 2033, 2034, 2036, 2037, 2038, 2039, 2040, 2041, 2042, 2043, 2044, 2045, 2046, 2047, 2048, 2050, 2052, 2053, 2054, 2055, 2056, 2057, 2058, 2059, 2060, 2061, 2062, 2063, 2064, 2065, 2066, 2067, 2068, 2069, 2070, 2071, 2072, 2073, 2074, 2076, 2077, 2079, 2080, 2081, 2082, 2083, 2084, 2085, 2086, 2087, 2088, 2089, 2090, 2091, 2092, 2093, 2094, 2095, 2096, 2097, 2098, 2099, 2100, 2101, 2102, 2103, 2104, 2105, 2106, 2107, 
2108, 2109, 2110, 2111, 2112, 2113, 2114, 2115, 2116, 2117, 2118, 2119, 2120, 2121, 2122, 2123, 2124, 2125, 2126, 2127, 2128, 2129, 2130, 2131, 2132, 2133, 2134, 2135, 2137, 2138, 2139, 2140, 2141, 2142, 2143, 2144, 2145, 2146, 2147, 2148, 2149, 2150, 2151, 2152, 2153, 2154, 2155, 2156, 2157, 2158, 2159, 2160, 2161, 2162, 2163, 2164, 2165, 2166, 2167, 2168, 2169, 2170, 2171, 2172, 2173, 2174, 2176, 2177, 2178, 2179, 2180, 2182, 2183, 2184, 2185, 2186, 2187, 2188, 2189, 2190, 2191, 2192, 2193, 2194, 2195, 2196, 2197, 2198, 2199, 2200, 2201, 2202, 2203, 2204, 2205, 2206, 2207, 2208, 2209, 2211, 2212, 2213, 2214, 2215, 2216, 2217, 2218, 2219, 2220, 2221, 2222, 2223, 2225, 2226, 2227, 2228, 2229, 2230, 2231, 2232, 2233, 2234, 2235, 2236, 2237, 2238, 2239, 2240, 2242, 2243, 2244, 2245, 2246, 2247, 2248, 2249, 2251, 2252, 2253, 2254, 2255, 2256, 2257, 2258, 2260, 2261, 2262, 2263, 2264, 2265, 2266, 2267, 2268, 2269, 2270, 2271, 2272, 2273, 2275, 2276, 2277, 2278, 2279, 2280, 2281, 2282, 2283, 2284, 2285, 2287, 2288, 2289, 2290, 2291, 2293, 2294, 2295, 2296, 2297, 2298, 2299, 2300, 2301, 2302, 2303, 2304, 2305, 2306, 2307, 2308, 2309, 2310, 2311, 2312, 2313, 2315, 2316, 2317, 2319, 2320, 2321, 2322, 2323, 2324, 2325, 2326, 2327, 2328, 2329, 2330, 2331, 2332, 2333, 2335, 2336, 2337, 2338, 2339, 2340, 2341, 2342, 2343, 2344, 2345, 2346, 2347, 2348, 2349, 2350, 2351, 2352, 2353, 2354, 2355, 2356, 2357, 2358, 2359, 2360, 2361, 2362, 2363, 2364, 2365, 2366, 2367, 2368, 2369, 2370, 2371, 2374, 2376, 2377, 2378, 2380, 2381, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2433, 2434, 2436, 2437, 2439, 2440, 2441, 2442, 2443, 2444, 2445, 2447, 2449, 2450, 2451, 2452, 2453, 2454, 2455, 2456, 2457, 2458, 2459, 2460, 2461, 2463, 2464, 2465, 2466, 
2467, 2468, 2469, 2470, 2471, 2472, 2473, 2474, 2475, 2476, 2477, 2478, 2479, 2480, 2481, 2483, 2484, 2485, 2486, 2487, 2488, 2489, 2490, 2491, 2493, 2495, 2496, 2497, 2498, 2499, 2500, 2501, 2502, 2503, 2505, 2506, 2507, 2508, 2509, 2511, 2512, 2514, 2515, 2516, 2517, 2518, 2519, 2520, 2521, 2522, 2523, 2524, 2525, 2526, 2527, 2528, 2529, 2530, 2531, 2532, 2533, 2534, 2535, 2536, 2537, 2538, 2539, 2540, 2541, 2542, 2543, 2544, 2547, 2548, 2549, 2550, 2551, 2552, 2553, 2554, 2555, 2556, 2557, 2558, 2559, 2561, 2562, 2564, 2565, 2567, 2568, 2569, 2570, 2571, 2572, 2573, 2574, 2575, 2576, 2577, 2578, 2579, 2580, 2581, 2582, 2583, 2584, 2585, 2586, 2587, 2589, 2590, 2591, 2592, 2593, 2594, 2595, 2596, 2597, 2598, 2599, 2600, 2601, 2603, 2604, 2605, 2607, 2608, 2609, 2610, 2611, 2612, 2613, 2614, 2615, 2616, 2617, 2618, 2619, 2620, 2621, 2622, 2623, 2625, 2626, 2627, 2628, 2629, 2630, 2631, 2632, 2633, 2635, 2636, 2637, 2639, 2640, 2641, 2642, 2643, 2644, 2645, 2646, 2647, 2648, 2650, 2651, 2652, 2653, 2655, 2656, 2657, 2658, 2660, 2661, 2662, 2663, 2664, 2665, 2666, 2667, 2668, 2669, 2671, 2672, 2673, 2674, 2675, 2676, 2677, 2678, 2679, 2681, 2682, 2683, 2684, 2685, 2686, 2687, 2688, 2689, 2690, 2691, 2692, 2693, 2694, 2695, 2696, 2697, 2699, 2700, 2701, 2702, 2703, 2704, 2705, 2706, 2707, 2708, 2709, 2710, 2711, 2713, 2714, 2715, 2716, 2717, 2718, 2719, 2720, 2722, 2723, 2724, 2725, 2726, 2727, 2728, 2729, 2730, 2731, 2732, 2733, 2734, 2735, 2736, 2737, 2739, 2740, 2741, 2743, 2744, 2745, 2746, 2747, 2749, 2750, 2751, 2752, 2753, 2754, 2755, 2756, 2757, 2758, 2759, 2760, 2761, 2762, 2763, 2764, 2765, 2766, 2767, 2768, 2769, 2770, 2772, 2774, 2775, 2776, 2777, 2779, 2780, 2781, 2782, 2783, 2784, 2785, 2786, 2787, 2788, 2789, 2790, 2791, 2793, 2794, 2795, 2796, 2798, 2799, 2800, 2801, 2802, 2803, 2805, 2806, 2807, 2808, 2809, 2810, 2811, 2812, 2813, 2814, 2815, 2816, 2817, 2818, 2819, 2820, 2821, 2822, 2823, 2825, 2827, 2829, 2830, 2831, 2832, 2833, 2834, 2835, 2836, 
2837, 2838, 2840, 2841, 2843, 2845, 2846, 2847, 2848, 2849, 2850, 2851, 2852, 2853, 2854, 2855, 2856, 2857, 2858, 2859, 2860, 2861, 2862, 2863, 2868, 2869, 2870, 2873, 2875, 2876, 2877, 2878, 2879, 2880, 2881, 2882, 2883, 2884, 2885, 2886, 2887, 2889, 2891, 2892, 2893, 2894, 2895, 2896, 2897, 2899, 2900, 2901, 2903, 2904, 2906, 2907, 2908, 2909, 2910, 2911, 2913, 2914, 2915, 2916, 2917, 2918, 2919, 2920, 2921, 2922, 2923, 2924, 2925, 2926, 2927, 2928, 2929, 2931, 2932, 2933, 2934, 2935, 2936, 2937, 2938, 2939, 2940, 2944, 2945, 2946, 2948, 2950, 2951, 2952, 2953, 2954, 2955, 2956, 2957, 2958, 2959, 2960, 2961, 2962, 2963, 2964, 2965, 2967, 2968, 2969, 2970, 2971, 2972, 2974, 2975, 2976, 2977, 2978, 2979, 2980, 2981, 2982, 2983, 2984, 2985, 2986, 2987, 2988, 2990, 2991, 2992, 2993, 2994, 2995, 2996, 2997, 2998, 2999, 3000, 3002, 3003, 3004, 3006, 3007, 3008, 3009, 3012, 3013, 3014, 3015, 3016, 3017, 3018, 3019, 3020, 3021, 3022, 3023, 3024, 3025, 3026, 3027, 3028, 3029, 3030, 3031, 3032, 3033, 3034, 3035, 3036, 3037, 3038, 3039, 3041, 3043, 3044, 3045, 3046, 3047, 3048, 3049, 3050, 3051, 3052, 3053, 3054, 3056, 3057, 3058, 3059, 3060, 3061, 3062, 3063, 3064, 3067, 3068, 3069, 3070, 3071, 3072, 3073, 3074, 3075, 3076, 3079, 3080, 3082, 3083, 3086, 3087, 3090, 3091, 3092, 3094, 3095, 3096, 3097, 3098, 3100, 3102, 3103, 3104, 3105, 3107, 3108, 3109, 3110, 3111, 3112, 3113, 3114, 3115, 3117, 3118, 3119, 3120, 3122, 3123, 3126, 3127, 3129, 3131, 3132, 3133, 3134, 3135, 3136, 3137, 3138, 3139, 3140, 3141, 3142, 3143, 3144, 3145, 3148, 3149, 3150, 3151, 3154, 3155, 3156, 3157, 3158, 3160, 3161, 3163, 3164, 3165, 3166, 3167, 3168, 3169, 3170, 3172, 3173, 3175, 3176, 3177, 3178, 3180, 3181, 3184, 3185, 3186, 3187, 3188, 3189, 3190, 3191, 3192, 3194, 3195, 3196, 3197, 3198, 3199, 3201, 3202, 3203, 3204, 3206, 3207, 3208, 3211, 3213, 3214, 3215, 3216, 3217, 3219, 3220, 3221, 3222, 3224, 3226, 3228, 3229, 3230, 3231, 3232, 3233, 3234, 3236, 3237, 3238, 3239, 3240, 3241, 3242, 
3243, 3244, 3246, 3247, 3249, 3250, 3252, 3253, 3254, 3255, 3257, 3258, 3259, 3260, 3261, 3262, 3263, 3264, 3266, 3267, 3268, 3269, 3270, 3271, 3274, 3275, 3276, 3277, 3278, 3279, 3280, 3281, 3282, 3283, 3284, 3285, 3286, 3287, 3288, 3289, 3290, 3291, 3293, 3294, 3295, 3296, 3297, 3298, 3300, 3301, 3302, 3303, 3304, 3305, 3307, 3308, 3309, 3310, 3311, 3313, 3314, 3315, 3316, 3317, 3321, 3322, 3323, 3324, 3326, 3327, 3329, 3330, 3331, 3332, 3333, 3334, 3335, 3337, 3338, 3339, 3340, 3342, 3343, 3344, 3345, 3346, 3347, 3348, 3349, 3350, 3352, 3353, 3354, 3355, 3357, 3358, 3360, 3361, 3362, 3363, 3364, 3366, 3367, 3368, 3369, 3370, 3371, 3372, 3373, 3374, 3376, 3377, 3379, 3381, 3382, 3384, 3386, 3387, 3388, 3390, 3391, 3392, 3394, 3395, 3396, 3397, 3398, 3399, 3400, 3401, 3402, 3403, 3404, 3405, 3406, 3407, 3409, 3410, 3411, 3412, 3413, 3414, 3415, 3417, 3418, 3419, 3420, 3421, 3422, 3423, 3426, 3427, 3428, 3429, 3430, 3431, 3432, 3433, 3434, 3436, 3438, 3440, 3442, 3443, 3444, 3445, 3446, 3450, 3451, 3452, 3455, 3456, 3457, 3458, 3459, 3460, 3461, 3462, 3463, 3465, 3466, 3467, 3468, 3469, 3470, 3471, 3472, 3473, 3478, 3480, 3481, 3485, 3486, 3487, 3490, 3492, 3493, 3494, 3495, 3496, 3498, 3499, 3500, 3501, 3503, 3504, 3505, 3507, 3508, 3510, 3511, 3512, 3514, 3516, 3517, 3518, 3519, 3520, 3524, 3527, 3528, 3529, 3530, 3531, 3532, 3533, 3534, 3536, 3537, 3538, 3540, 3542, 3544, 3545, 3546, 3547, 3549, 3550, 3551, 3552, 3554, 3556, 3557, 3558, 3559, 3560, 3561, 3562, 3563, 3564, 3567, 3568, 3570, 3571, 3572, 3573, 3574, 3575, 3576, 3577, 3578, 3579, 3583, 3584, 3585, 3586, 3587, 3588, 3589, 3590, 3591, 3594, 3595, 3598, 3601, 3603, 3604, 3605, 3608, 3610, 3611, 3612, 3615, 3616, 3620, 3622, 3623, 3624, 3625, 3626, 3628, 3629, 3630, 3632, 3634, 3635, 3636, 3638, 3640, 3641, 3643, 3644, 3646, 3648, 3649, 3651, 3652, 3653, 3654, 3655, 3656, 3657, 3658, 3659, 3662, 3663, 3664, 3665, 3669, 3670, 3671, 3672, 3674, 3675, 3676, 3677, 3679, 3680, 3681, 3684, 3685, 3687, 3688, 
3689, 3690, 3693, 3694, 3695, 3696, 3698, 3700, 3701, 3702, 3703, 3704, 3705, 3706, 3708, 3710, 3711, 3713, 3714, 3715, 3717, 3718, 3720, 3721, 3722, 3723, 3724, 3726, 3727, 3728, 3729, 3730, 3732, 3733, 3735, 3736, 3737, 3738, 3739, 3740, 3743, 3744, 3745, 3748, 3749, 3750, 3751, 3752, 3753, 3754, 3756, 3759, 3760, 3761, 3762, 3763, 3764, 3765, 3766, 3767, 3768, 3769, 3770, 3771, 3773, 3774, 3776, 3777, 3778, 3779, 3780, 3782, 3783, 3784, 3786, 3790, 3791, 3792, 3794, 3795, 3796, 3797, 3798, 3800, 3803, 3805, 3806, 3809, 3810, 3812, 3813, 3814, 3815, 3816, 3817, 3818, 3819, 3820, 3821, 3823, 3824, 3825, 3827, 3829, 3831, 3832, 3834, 3837, 3839, 3840, 3841, 3842, 3843, 3844, 3845, 3846, 3848, 3849, 3850, 3851, 3854, 3855, 3856, 3857, 3858, 3859, 3862, 3863, 3864, 3867, 3868, 3869, 3870, 3872, 3873, 3874, 3875, 3876, 3877, 3881, 3884, 3885, 3886, 3888, 3889, 3895, 3897, 3899, 3902, 3904, 3905, 3908, 3909, 3910, 3911, 3912, 3913, 3914, 3915, 3916, 3917, 3918, 3919, 3921, 3923, 3924, 3926, 3927, 3929, 3931, 3932, 3933, 3935, 3936, 3938, 3939, 3940, 3941, 3942, 3943, 3944, 3945, 3947, 3948, 3949, 3950, 3951, 3952, 3953, 3954, 3955, 3957, 3959, 3960, 3962, 3965, 3967, 3969, 3970, 3972, 3973, 3975, 3977, 3981, 3982, 3984, 3986, 3988, 3990, 3992, 3993, 3994, 3995, 3997, 3998, 3999, 4000, 4003, 4004, 4005, 4006, 4007, 4009, 4011, 4012, 4013, 4014, 4015, 4016, 4017, 4020, 4022, 4023, 4024, 4025, 4028, 4030, 4031, 4037, 4038, 4039, 4040, 4041, 4043, 4044, 4045, 4046, 4047, 4048, 4050, 4053, 4054, 4056, 4060, 4062, 4063, 4064, 4066, 4068, 4069, 4070, 4071, 4073, 4075, 4079, 4080, 4082, 4083, 4084, 4086, 4087, 4089, 4091, 4092, 4094, 4095, 4096, 4099, 4101, 4103, 4104, 4105, 4108, 4110, 4111, 4112, 4116, 4117, 4118, 4119, 4120, 4121, 4123, 4124, 4126, 4127, 4128, 4129, 4130, 4131, 4132, 4133, 4134, 4135, 4136, 4137, 4138, 4139, 4140, 4143, 4144, 4145, 4146, 4147, 4148, 4149, 4150, 4151, 4152, 4153, 4157, 4158, 4162, 4166, 4168, 4170, 4173, 4174, 4176, 4177, 4178, 4182, 4185, 
4186, 4189, 4190, 4191, 4194, 4196, 4198, 4200, 4204, 4207, 4209, 4210, 4211, 4213, 4216, 4222, 4223, 4227, 4229, 4230, 4231, 4232, 4233, 4235, 4236, 4239, 4240, 4243, 4244, 4246, 4247, 4248, 4253, 4254, 4256, 4259, 4260, 4262, 4263, 4264, 4265, 4266, 4269, 4274, 4276, 4278, 4279, 4280, 4281, 4283, 4286, 4287, 4289, 4290, 4291, 4293, 4294, 4295, 4297, 4298, 4299, 4300, 4301, 4303, 4305, 4306, 4307, 4309, 4311, 4312, 4313, 4314, 4315, 4317, 4318, 4319, 4320, 4321, 4322, 4323, 4324, 4325, 4328, 4329, 4330, 4331, 4332, 4333, 4335, 4336, 4339, 4341, 4343, 4344, 4348, 4353, 4354, 4357, 4358, 4359, 4362, 4365, 4366, 4367, 4369, 4370, 4372, 4373, 4374, 4378, 4380, 4381, 4382, 4383, 4384, 4385, 4386, 4387, 4388, 4389, 4391, 4392, 4393, 4394, 4395, 4396, 4397, 4399, 4401, 4402, 4403, 4404, 4406, 4408, 4409, 4411, 4412, 4413, 4414, 4415, 4416, 4418, 4420, 4424, 4425, 4428, 4430, 4432, 4434, 4436, 4438, 4439, 4440, 4441, 4442, 4443, 4444, 4445, 4446, 4447, 4448, 4450, 4451, 4453, 4455, 4457, 4458, 4459, 4460, 4461, 4463, 4464, 4465, 4466, 4468, 4471, 4475, 4477, 4478, 4480, 4481, 4482, 4487, 4488, 4490, 4492, 4493, 4495, 4497, 4499, 4500, 4503, 4504, 4505, 4508, 4509, 4512, 4513, 4515, 4517, 4519, 4520, 4522, 4527, 4531, 4533, 4535, 4536, 4537, 4539, 4541, 4542, 4543, 4544, 4545, 4547, 4554, 4556, 4557, 4558, 4561, 4562, 4564, 4565, 4567, 4568, 4570, 4572, 4574, 4575, 4576, 4577, 4578, 4579, 4580, 4581, 4582, 4583, 4585, 4586, 4587, 4588, 4589, 4590, 4591, 4592, 4593, 4594, 4596, 4597, 4599, 4601, 4602, 4605, 4606, 4608, 4610, 4612, 4613, 4617, 4619, 4622, 4623, 4629, 4630, 4634, 4635, 4636, 4638, 4639, 4641, 4642, 4644, 4645, 4646, 4647, 4648, 4649, 4654, 4655, 4656, 4657, 4659, 4660, 4661, 4664, 4665, 4666, 4667, 4674, 4675, 4676, 4680, 4681, 4683, 4684, 4687, 4688, 4692, 4693, 4694, 4695, 4696, 4697, 4698, 4700, 4707, 4708, 4709, 4711, 4712, 4713, 4714, 4716, 4717, 4718, 4719, 4720, 4721, 4722, 4723, 4725, 4726, 4727, 4728, 4731, 4733, 4736, 4737, 4738, 4741, 4744, 4745, 
4746, 4749, 4751, 4752, 4758, 4760, 4761, 4763, 4764, 4765, 4769, 4770, 4771, 4772, 4775, 4777, 4778, 4780, 4782, 4785, 4786, 4787, 4788, 4789, 4790, 4791, 4792, 4793, 4795, 4798, 4800, 4803, 4805, 4807, 4808, 4809, 4816, 4819, 4820, 4822, 4824, 4826, 4829, 4830, 4831, 4833, 4835, 4837, 4840, 4841, 4842, 4843, 4844, 4845, 4846, 4847, 4848, 4850, 4852, 4853, 4855, 4856, 4859, 4860, 4861, 4867, 4869, 4871, 4872, 4873, 4874, 4877, 4879, 4882, 4885, 4886, 4887, 4888, 4889, 4894, 4895, 4896, 4897, 4899, 4900, 4902, 4903, 4904, 4906, 4908, 4909, 4910, 4912, 4914, 4917, 4920, 4922, 4923, 4924, 4925, 4928, 4929, 4930, 4932, 4937, 4941, 4943, 4945, 4948, 4949, 4951, 4953, 4954, 4956, 4958, 4959, 4960, 4961, 4962, 4963, 4965, 4967, 4968, 4969, 4971, 4974, 4976, 4978, 4979, 4982, 4984, 4985, 4986, 4987, 4990, 4991, 4994, 4996, 4997, 4998, 5000, 5003, 5004, 5005, 5006, 5007, 5008, 5009, 5010, 5011, 5012, 5016, 5021, 5024, 5028, 5029, 5034, 5037, 5038, 5039, 5041, 5043, 5045, 5047, 5048, 5049, 5050, 5052, 5057, 5058, 5059, 5060, 5061, 5063, 5064, 5065, 5068, 5073, 5074, 5075, 5076, 5078, 5082, 5083, 5084, 5086, 5087, 5089, 5090, 5091, 5092, 5094, 5095, 5098, 5102, 5106, 5108, 5109, 5110, 5112, 5114, 5115, 5116, 5118, 5119, 5122, 5124, 5126, 5127, 5129, 5130, 5131, 5132, 5133, 5137, 5142, 5143, 5145, 5149, 5151, 5152, 5154, 5156, 5157, 5163, 5164, 5167, 5169, 5171, 5172, 5173, 5176, 5177, 5181, 5187, 5188, 5191, 5193, 5195, 5196, 5201, 5204, 5205, 5206, 5207, 5210, 5214, 5215, 5219, 5220, 5222, 5223, 5231, 5233, 5234, 5235, 5236, 5238, 5239, 5241, 5244, 5248, 5249, 5250, 5251, 5252, 5253, 5254, 5260, 5261, 5262, 5265, 5266, 5267, 5270, 5271, 5275, 5276, 5278, 5279, 5282, 5284, 5287, 5288, 5289, 5291, 5293, 5296, 5299, 5301, 5303, 5304, 5305, 5306, 5310, 5312, 5313, 5314, 5315, 5317, 5320, 5326, 5329, 5331, 5334, 5336, 5338, 5340, 5341, 5342, 5343, 5345, 5346, 5347, 5348, 5349, 5350, 5354, 5355, 5356, 5359, 5361, 5365, 5366, 5367, 5368, 5372, 5380, 5381, 5389, 5393, 5396, 5397, 
5399, 5401, 5403, 5406, 5414, 5417, 5418, 5420, 5421, 5423, 5426, 5427, 5431, 5432, 5435, 5436, 5437, 5441, 5442, 5443, 5445, 5447, 5450, 5452, 5455, 5461, 5462, 5464, 5467, 5473, 5474, 5475, 5482, 5483, 5486, 5491, 5495, 5496, 5498, 5499, 5501, 5505, 5506, 5511, 5514, 5517, 5521, 5523, 5527, 5533, 5535, 5539, 5542, 5543, 5547, 5548, 5549, 5550, 5551, 5553, 5558, 5559, 5560, 5561, 5562, 5563, 5564, 5567, 5569, 5571, 5574, 5581, 5583, 5584, 5585, 5597, 5600, 5601, 5603, 5605, 5607, 5608, 5610, 5611, 5613, 5614, 5618, 5619, 5624, 5626, 5631, 5632, 5637, 5639, 5640, 5641, 5643, 5657, 5658, 5665, 5666, 5669, 5670, 5673, 5674, 5678, 5679, 5680, 5681, 5689, 5691, 5694, 5695, 5698, 5699, 5700, 5701, 5704, 5705, 5706, 5709, 5711, 5715, 5717, 5722, 5724, 5728, 5729, 5731, 5733, 5735, 5736, 5737, 5739, 5741, 5742, 5744, 5745, 5746, 5749, 5754, 5757, 5758, 5763, 5766, 5768, 5769, 5773, 5774, 5776, 5779, 5780, 5781, 5784, 5787, 5788, 5789, 5792, 5794, 5795, 5797, 5798, 5799, 5801, 5802, 5803, 5804, 5805, 5806, 5807, 5809, 5810, 5811, 5816, 5818, 5827, 5828, 5829, 5830, 5833, 5836, 5837, 5838, 5839, 5845, 5847, 5848, 5850, 5853, 5854, 5861, 5862, 5865, 5871, 5872, 5873, 5874, 5878, 5879, 5880, 5883, 5885, 5887, 5888, 5889, 5891, 5894, 5901, 5902, 5903, 5904, 5907, 5908, 5909, 5910, 5914, 5916, 5918, 5920, 5926, 5927, 5934, 5935, 5936, 5942, 5943, 5944, 5945, 5946, 5949, 5953, 5956, 5957, 5958, 5961, 5964, 5966, 5969, 5972, 5973, 5976, 5978, 5980, 5988, 5990, 5993, 5996, 5997, 5999, 6000, 6004, 6005, 6008, 6010, 6012, 6013, 6014, 6016, 6020, 6025, 6027, 6029, 6030, 6036, 6042, 6043, 6046, 6050, 6053, 6056, 6059, 6060, 6072, 6077, 6080, 6085, 6086, 6089, 6091, 6095, 6100, 6101, 6102, 6106, 6107, 6108, 6110, 6112, 6114, 6116, 6127, 6132, 6134, 6138, 6141, 6144, 6145, 6150, 6153, 6157, 6158, 6162, 6164, 6170, 6171, 6172, 6174, 6178, 6181, 6182, 6187, 6188, 6191, 6196, 6199, 6200, 6203, 6204, 6205, 6207, 6209, 6212, 6215, 6217, 6220, 6227, 6236, 6237, 6242, 6246, 6248, 6250, 6251, 
6253, 6258, 6259, 6264, 6269, 6271, 6274, 6279, 6280, 6281, 6284, 6285, 6286, 6288, 6290, 6298, 6307, 6312, 6313, 6316, 6317, 6320, 6321, 6322, 6324, 6325, 6332, 6333, 6337, 6343, 6346, 6350, 6352, 6353, 6359, 6360, 6362, 6368, 6374, 6376, 6383, 6384, 6385, 6386, 6388, 6392, 6393, 6397, 6400, 6402, 6403, 6404, 6408, 6410, 6411, 6413, 6419, 6421, 6422, 6424, 6427, 6429, 6432, 6438, 6439, 6445, 6447, 6449, 6450, 6451, 6463, 6468, 6471, 6475, 6481, 6482, 6483, 6485, 6486, 6489, 6490, 6491, 6492, 6493, 6495, 6497, 6503, 6507, 6510, 6512, 6513, 6519, 6525, 6526, 6529, 6530, 6531, 6532, 6533, 6535, 6536, 6538, 6542, 6551, 6553, 6554, 6557, 6562, 6563, 6567, 6570, 6571, 6572, 6573, 6574, 6575, 6586, 6590, 6596, 6610, 6613, 6614, 6618, 6619, 6622, 6627, 6637, 6641, 6649, 6651, 6657, 6659, 6662, 6671, 6673, 6677, 6683, 6684, 6687, 6690, 6691, 6695, 6699, 6700, 6703, 6704, 6706, 6707, 6710, 6711, 6714, 6718, 6728, 6737, 6739, 6742, 6743, 6745, 6746, 6748, 6753, 6754, 6758, 6762, 6763, 6766, 6767, 6770, 6771, 6781, 6784, 6791, 6797, 6798, 6807, 6808, 6809, 6812, 6815, 6817, 6822, 6825, 6831, 6835, 6836, 6837, 6839, 6840, 6843, 6844, 6850, 6851, 6857, 6859, 6873, 6874, 6878, 6879, 6880, 6882, 6888, 6889, 6890, 6900, 6904, 6909, 6911, 6913, 6915, 6920, 6921, 6922, 6929, 6933, 6943, 6945, 6947, 6955, 6958, 6961, 6963, 6964, 6968, 6969, 6971, 6975, 6979, 6980, 6981, 6982, 6983, 6991, 6993, 6994, 6997, 6998, 6999, 7003, 7005, 7007, 7010, 7019, 7020, 7025, 7027, 7028, 7032, 7038, 7041, 7048, 7049, 7050, 7051, 7063, 7066, 7067, 7073, 7080, 7082, 7084, 7089, 7098, 7100, 7102, 7103, 7104, 7105, 7107, 7108, 7111, 7114, 7118, 7119, 7123, 7124, 7132, 7134, 7135, 7137, 7138, 7154, 7159, 7162, 7171, 7175, 7177, 7179, 7180, 7190, 7192, 7195, 7197, 7198, 7203, 7208, 7216, 7218, 7222, 7225, 7242, 7245, 7249, 7252, 7254, 7255, 7263, 7264, 7265, 7279, 7289, 7290, 7296, 7298, 7303, 7304, 7313, 7317, 7318, 7331, 7336, 7337, 7338, 7343, 7345, 7351, 7353, 7357, 7365, 7369, 7372, 7373, 7378, 7384, 
7386, 7387, 7388, 7396, 7400, 7406, 7408, 7419, 7424, 7426, 7429, 7433, 7434, 7440, 7441, 7443, 7444, 7449, 7451, 7454, 7458, 7459, 7464, 7468, 7469, 7475, 7482, 7485, 7487, 7495, 7496, 7503, 7505, 7506, 7509, 7513, 7518, 7522, 7529, 7530, 7531, 7541, 7544, 7546, 7547, 7554, 7558, 7560, 7561, 7567, 7578, 7579, 7585, 7586, 7595, 7598, 7601, 7602, 7604, 7606, 7608, 7612, 7613, 7620, 7621, 7622, 7623, 7624, 7628, 7631, 7633, 7634, 7635, 7641, 7649, 7653, 7668, 7673, 7685, 7687, 7695, 7696, 7699, 7702, 7704, 7707, 7708, 7711, 7712, 7717, 7724, 7727, 7735, 7741, 7742, 7743, 7744, 7747, 7752, 7766, 7773, 7780, 7781, 7783, 7785, 7788, 7791, 7798, 7800, 7802, 7803, 7811, 7813, 7816, 7818, 7819, 7821, 7822, 7825, 7826, 7831, 7832, 7836, 7838, 7845, 7848, 7858, 7861, 7863, 7864, 7867, 7870, 7876, 7879, 7880, 7881, 7895, 7900, 7907, 7918, 7928, 7929, 7934, 7935, 7938, 7944, 7945, 7951, 7957, 7962, 7968, 7973, 7974, 7982, 7984, 7985, 7990, 8000, 8001, 8004, 8014, 8015, 8016, 8017, 8023, 8025, 8029, 8032, 8036, 8037, 8038, 8040, 8043, 8044, 8047, 8066, 8077, 8079, 8089, 8093, 8094, 8097, 8101, 8103, 8104, 8106, 8107, 8112, 8114, 8118, 8119, 8121, 8127, 8131, 8132, 8135, 8136, 8138, 8139, 8141, 8142, 8148, 8150, 8152, 8153, 8163, 8165, 8167, 8180, 8195, 8205, 8206, 8218, 8224, 8226, 8229, 8230, 8238, 8251, 8254, 8259, 8262, 8263, 8265, 8266, 8267, 8277, 8278, 8279, 8289, 8291, 8295, 8298, 8300, 8303, 8304, 8309, 8311, 8312, 8313, 8318, 8319, 8326, 8332, 8334, 8339, 8345, 8351, 8366, 8368, 8379, 8381, 8399, 8402, 8403, 8408, 8412, 8413, 8417, 8422, 8434, 8436, 8444, 8449, 8460, 8463, 8465, 8480, 8486, 8489, 8491, 8494, 8509, 8514, 8515, 8535, 8536, 8538, 8541, 8545, 8548, 8553, 8554, 8556, 8558, 8562, 8563, 8564, 8583, 8585, 8590, 8592, 8594, 8600, 8603, 8605, 8619, 8623, 8626, 8627, 8629, 8637, 8647, 8648, 8649, 8652, 8654, 8666, 8669, 8674, 8680, 8689, 8692, 8717, 8725, 8729, 8730, 8741, 8749, 8750, 8758, 8781, 8784, 8785, 8794, 8806, 8821, 8823, 8826, 8828, 8837, 8839, 8844, 
8859, 8860, 8863, 8866, 8873, 8874, 8876, 8883, 8892, 8894, 8897, 8899, 8902, 8903, 8918, 8919, 8929, 8938, 8947, 8950, 8953, 8957, 8963, 8969, 8971, 8973, 8979, 8982, 8990, 9001, 9002, 9004, 9009, 9019, 9039, 9047, 9049, 9051, 9059, 9064, 9068, 9072, 9077, 9083, 9102, 9103, 9110, 9121, 9131, 9135, 9139, 9143, 9146, 9149, 9154, 9160, 9173, 9192, 9194, 9200, 9202, 9207, 9214, 9216, 9221, 9224, 9228, 9231, 9246, 9252, 9261, 9262, 9269, 9277, 9299, 9301, 9304, 9305, 9306, 9311, 9314, 9317, 9319, 9324, 9326, 9328, 9329, 9336, 9339, 9346, 9347, 9359, 9366, 9367, 9374, 9397, 9405, 9407, 9421, 9444, 9447, 9449, 9480, 9506, 9531, 9541, 9569, 9579, 9585, 9601, 9608, 9609, 9610, 9618, 9622, 9629, 9630, 9634, 9636, 9645, 9664, 9669, 9670, 9676, 9678, 9680, 9683, 9687, 9689, 9698, 9710, 9711, 9713, 9714, 9716, 9720, 9725, 9750, 9756, 9767, 9774, 9779, 9796, 9804, 9824, 9827, 9851, 9864, 9874, 9881, 9883, 9894, 9895, 9898, 9902, 9911, 9916, 9935, 9956, 9962, 9965, 9976, 9994, 9997, 10005, 10021, 10031, 10035, 10041, 10045, 10052, 10054, 10065, 10072, 10077, 10086, 10088, 10112, 10114, 10122, 10133, 10142, 10150, 10152, 10171, 10177, 10179, 10180, 10183, 10185, 10189, 10191, 10200, 10215, 10218, 10236, 10250, 10252, 10253, 10269, 10273, 10281, 10285, 10287, 10332, 10333, 10346, 10347, 10350, 10354, 10357, 10360, 10373, 10374, 10378, 10386, 10394, 10395, 10399, 10406, 10436, 10438, 10442, 10443, 10451, 10465, 10469, 10483, 10500, 10532, 10536, 10541, 10558, 10561, 10576, 10583, 10596, 10600, 10613, 10621, 10628, 10635, 10638, 10639, 10653, 10655, 10662, 10667, 10685, 10697, 10721, 10724, 10735, 10749, 10757, 10758, 10768, 10772, 10773, 10776, 10786, 10787, 10788, 10834, 10855, 10861, 10865, 10884, 10888, 10889, 10890, 10905, 10907, 10910, 10924, 10925, 10943, 10950, 10957, 10971, 10984, 10995, 11008, 11016, 11066, 11084, 11093, 11103, 11115, 11126, 11146, 11149, 11174, 11177, 11219, 11222, 11240, 11246, 11254, 11262, 11265, 11269, 11278, 11281, 11285, 11287, 11298, 11303, 11305, 
11310, 11315, 11317, 11350, 11371, 11385, 11386, 11387, 11391, 11417, 11423, 11431, 11462, 11464, 11494, 11512, 11516, 11524, 11528, 11532, 11555, 11563, 11591, 11615, 11632, 11639, 11650, 11655, 11671, 11675, 11686, 11697, 11743, 11752, 11754, 11757, 11766, 11767, 11787, 11797, 11804, 11821, 11835, 11839, 11854, 11862, 11887, 11891, 11904, 11953, 11968, 11971, 11972, 11998, 12001, 12018, 12026, 12039, 12048, 12061, 12067, 12114, 12130, 12132, 12159, 12177, 12180, 12186, 12198, 12210, 12212, 12223, 12226, 12245, 12256, 12264, 12269, 12270, 12276, 12282, 12322, 12356, 12389, 12392, 12401, 12409, 12437, 12438, 12482, 12495, 12519, 12531, 12539, 12569, 12581, 12584, 12607, 12618, 12634, 12675, 12686, 12697, 12704, 12705, 12731, 12737, 12766, 12767, 12839, 12845, 12848, 12855, 12857, 12877, 12917, 12926, 12928, 12939, 12956, 12961, 12972, 12980, 13014, 13015, 13021, 13044, 13052, 13054, 13089, 13094, 13099, 13107, 13117, 13118, 13156, 13160, 13164, 13165, 13186, 13189, 13204, 13229, 13238, 13242, 13265, 13292, 13297, 13308, 13315, 13338, 13342, 13354, 13360, 13408, 13410, 13450, 13460, 13489, 13494, 13501, 13546, 13551, 13562, 13565, 13578, 13620, 13654, 13658, 13669, 13683, 13698, 13711, 13718, 13761, 13763, 13774, 13818, 13834, 13836, 13849, 13851, 13853, 13874, 13887, 13893, 13901, 13930, 13931, 14000, 14004, 14054, 14058, 14093, 14107, 14144, 14148, 14170, 14190, 14204, 14215, 14220, 14232, 14282, 14344, 14352, 14355, 14363, 14387, 14412, 14440, 14462, 14481, 14522, 14530, 14533, 14611, 14646, 14657, 14679, 14692, 14752, 14850, 14889, 14902, 14922, 14930, 14968, 15030, 15035, 15062, 15088, 15120, 15161, 15169, 15187, 15261, 15265, 15298, 15302, 15311, 15341, 15352, 15423, 15437, 15442, 15445, 15449, 15459, 15474, 15477, 15485, 15511, 15515, 15520, 15561, 15578, 15681, 15740, 15787, 15801, 15834, 15841, 15904, 16063, 16119, 16125, 16173, 16178, 16232, 16236, 16264, 16353, 16358, 16377, 16397, 16402, 16430, 16431, 16432, 16486, 16489, 16517, 16563, 16649, 16727, 
16786, 16843, 16869, 16873, 16874, 16917, 16922, 16935, 16957, 16992, 17008, 17023, 17036, 17056, 17092, 17118, 17206, 17297, 17332, 17335, 17339, 17361, 17410, 17413, 17418, 17432, 17441, 17455, 17458, 17555, 17609, 17655, 17672, 17739, 17747, 17769, 17875, 17891, 17924, 17946, 17957, 17964, 17983, 18016, 18111, 18188, 18254, 18268, 18347, 18508, 18558, 18722, 18777, 18881, 18904, 18931, 18967, 19102, 19213, 19268, 19313, 19317, 19343, 19348, 19358, 19391, 19447, 19690, 19706, 19796, 19797, 19833, 19850, 19985, 20011, 20138, 20179, 20187, 20261, 20422, 20451, 20453, 20479, 20527, 20541, 20580, 20584, 20585, 20600, 20718, 20723, 20727, 20772, 20794, 20798, 20806, 20928, 20932, 21024, 21088, 21096, 21111, 21244, 21292, 21446, 21510, 21515, 21522, 21574, 21614, 21664, 21854, 21861, 21963, 22008, 22018, 22086, 22125, 22171, 22196, 22341, 22370, 22452, 22520, 22546, 22557, 22569, 22755, 22815, 22856, 22867, 22928, 22946, 23047, 23076, 23189, 23421, 23494, 23495, 23552, 23592, 23663, 23867, 23876, 23878, 23917, 23919, 24025, 24055, 24277, 24299, 24312, 24450, 24498, 24556, 24598, 24780, 24870, 25204, 25290, 25741, 25824, 25856, 25947, 26172, 26233, 26254, 26306, 26394, 26452, 26575, 26721, 26765, 26831, 26965, 27069, 27359, 27446, 27624, 27696, 27733, 28318, 28433, 29050, 29080, 29125, 29184, 29207, 29230, 29312, 29340, 29397, 29484, 29887, 29941, 31472, 31630, 31868, 32464, 32685, 32948, 34230, 34247, 34646, 35368, 35589, 36221, 36252, 36686, 36935, 37127, 37176, 37378, 38279, 39098, 39385, 41242, 41630, 41923, 42045, 43074, 44128, 44134, 45141, 45248, 45789, 51439, 52499, 52527, 52587, 56831, 57435, 58544, 58932, 59649, 64343, 66653, 66721, 71188, 81204, 98417, 102127]" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**day** : has unique data in this 
range [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**duration** : has unique data in this rangetext/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**campaign** : has unique data in this range [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 41, 43, 44, 46, 50, 51, 55, 58, 63]" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**pdays** : has unique data in this range [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 
178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 401, 403, 404, 405, 407, 409, 410, 411, 412, 413, 414, 415, 416, 417, 419, 420, 421, 422, 424, 425, 426, 427, 428, 430, 431, 432, 433, 434, 435, 436, 437, 439, 440, 442, 444, 445, 446, 449, 450, 452, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 469, 470, 472, 474, 475, 476, 477, 478, 479, 480, 481, 484, 485, 486, 489, 490, 491, 492, 493, 495, 500, 503, 504, 508, 511, 514, 515, 518, 520, 521, 524, 526, 528, 529, 530, 531, 532, 535, 536, 541, 542, 543, 544, 547, 550, 551, 553, 555, 557, 558, 561, 562, 578, 579, 585, 586, 587, 589, 592, 594, 595, 603, 616, 626, 633, 648, 651, 655, 656, 667, 670, 674, 680, 683, 686, 687, 690, 701, 717, 728, 745, 749, 756, 760, 761, 769, 771, 772, 774, 775, 776, 778, 779, 782, 784, 791, 792, 804, 805, 808, 826, 828, 831, 838, 842, 850, 854, 871]" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { 
"name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**previous** : has unique data in this range [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 35, 37, 38, 40, 41, 51, 55, 58, 275]" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] } ], "source": [ "# Print unique for each column\n", "\n", "for name in df_original_columns: \n", " if df_original[name].dtype == np.int64:\n", " #Sorting for better understanding\n", " sortedCategories =sorted(df_original[name].unique().tolist())\n", " \n", " formattedText = \"has unique data in this range {}\".format(sortedCategories)\n", " \n", " printTextAsMarkdown(name, formattedText, color=\"red\")\n", " print(\"\\n**************************************************************************************\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**job** : has unique data in this range ['admin.', 'blue-collar', 'entrepreneur', 'housemaid', 'management', 'retired', 'self-employed', 'services', 'student', 'technician', 'unemployed', 'unknown']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**marital** : has unique data in this range ['divorced', 'married', 'single']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", 
"**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**education** : has unique data in this range ['primary', 'secondary', 'tertiary', 'unknown']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**default** : has unique data in this range ['no', 'yes']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**housing** : has unique data in this range ['no', 'yes']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**loan** : has unique data in this range ['no', 'yes']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**contact** : has unique data in this range ['cellular', 'telephone', 'unknown']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**month** : has unique data in this range ['apr', 'aug', 'dec', 'feb', 'jan', 'jul', 'jun', 'mar', 'may', 'nov', 'oct', 'sep']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": 
"display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**poutcome** : has unique data in this range ['failure', 'other', 'success', 'unknown']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] }, { "data": { "text/markdown": [ "**Target** : has unique data in this range ['no', 'yes']" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "**************************************************************************************\n" ] } ], "source": [ "# We are most interested in qunie values of object data column, so lets filter out only object data type\n", "# Priting ${df[name]} and its unique values\n", "\n", "# Container for object column type, later on while label encoding we need to convert only those column which are of type object\n", "objectColumns = []\n", "\n", "for name in df_original_columns: \n", " if df_original[name].dtype == np.object:\n", " \n", " #Sorting for better understanding\n", " sortedCategories =sorted(df_original[name].unique().tolist())\n", " \n", " formattedText = \"has unique data in this range {}\".format(sortedCategories)\n", " \n", " printTextAsMarkdown(name, formattedText, color=\"red\")\n", " \n", " objectColumns.append(name)\n", " print(\"\\n**************************************************************************************\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let us examine this object data type\n", "\n", "- Data spread is very minimal mean less categories\n", "- There is some presense of invalid data i.e `unknown` in job, education, contact, poutcome(unknown) here indicated we don't know 
whether we have failed or success response from this person." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Copy for original dataframe\n", "\n", "df_main = df_original.copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### LabelEncoder and Caching" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "\n", "\n", "from sklearn import preprocessing\n", "\n", "# Create empty of of label encoders for different columns\n", "# i wish to save each encoder corresponding to different colum title\n", "\n", "columnEncoders = {}\n", "\n", "for name in objectColumns:\n", " le = preprocessing.LabelEncoder()\n", " # Fit encoder to pandas column\n", " le.fit(df_main[name])\n", " # apply transformation and assign it to df\n", " df_main[name] = le.transform(df_main[name]) \n", " #put name and encoder in map \n", " columnEncoders[name] = le\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " age job marital education default balance housing loan contact \\\n", "0 58 4 1 2 0 2143 1 0 2 \n", "1 44 9 2 1 0 29 1 0 2 \n", "2 33 2 1 1 0 2 1 1 2 \n", "3 47 1 1 3 0 1506 1 0 2 \n", "4 33 11 2 3 0 1 0 0 2 \n", "\n", " day month duration campaign pdays previous poutcome Target \n", "0 5 8 261 1 -1 0 3 0 \n", "1 5 8 151 1 -1 0 3 0 \n", "2 5 8 76 1 -1 0 3 0 \n", "3 5 8 92 1 -1 0 3 0 \n", "4 5 8 198 1 -1 0 3 0 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Now lets revist basic operation on data frame\n", "\n", "df_main.head()\n", "\n", "# We can now see all data into numeric form" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "age int64\n", "job int64\n", "marital int64\n", "education int64\n", "default int64\n", "balance int64\n", "housing int64\n", "loan int64\n", "contact int64\n", "day int64\n", "month int64\n", "duration int64\n", "campaign int64\n", "pdays int64\n", "previous int64\n", "poutcome int64\n", "Target int64\n", "dtype: object" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Lets quickly print data types\n", "\n", "df_main.dtypes\n", "\n", "# Should print all int " ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " count mean std min 25% 50% 75% \\\n", "age 45211.0 40.936210 10.618762 18.0 33.0 39.0 48.0 \n", "job 45211.0 4.339762 3.272657 0.0 1.0 4.0 7.0 \n", "marital 45211.0 1.167725 0.608230 0.0 1.0 1.0 2.0 \n", "education 45211.0 1.224813 0.747997 0.0 1.0 1.0 2.0 \n", "default 45211.0 0.018027 0.133049 0.0 0.0 0.0 0.0 \n", "balance 45211.0 1362.272058 3044.765829 -8019.0 72.0 448.0 1428.0 \n", "housing 45211.0 0.555838 0.496878 0.0 0.0 1.0 1.0 \n", "loan 45211.0 0.160226 0.366820 0.0 0.0 0.0 0.0 \n", "contact 45211.0 0.640242 0.897951 0.0 0.0 0.0 2.0 \n", "day 45211.0 15.806419 8.322476 1.0 8.0 16.0 21.0 \n", "month 45211.0 5.523014 3.006911 0.0 3.0 6.0 8.0 \n", "duration 45211.0 258.163080 257.527812 0.0 103.0 180.0 319.0 \n", "campaign 45211.0 2.763841 3.098021 1.0 1.0 2.0 3.0 \n", "pdays 45211.0 40.197828 100.128746 -1.0 -1.0 -1.0 -1.0 \n", "previous 45211.0 0.580323 2.303441 0.0 0.0 0.0 0.0 \n", "poutcome 45211.0 2.559974 0.989059 0.0 3.0 3.0 3.0 \n", "Target 45211.0 0.116985 0.321406 0.0 0.0 0.0 0.0 \n", "\n", " max \n", "age 95.0 \n", "job 11.0 \n", "marital 2.0 \n", "education 3.0 \n", "default 1.0 \n", "balance 102127.0 \n", "housing 1.0 \n", "loan 1.0 \n", "contact 2.0 \n", "day 31.0 \n", "month 11.0 \n", "duration 4918.0 \n", "campaign 63.0 \n", "pdays 871.0 \n", "previous 275.0 \n", "poutcome 3.0 \n", "Target 1.0 " ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let analyse more about data, \n", "\n", "#df_main.describe() difficult to view hence lets apply transpose() to visually see it better\n", "\n", "df_main.describe().transpose()\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visual Analysis" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 39922\n", "1 5289\n", "Name: Target, dtype: int64\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, 
"output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Lets see response distribution for target column\n", "print(df_main['Target'].value_counts())\n", "\n", "sns.countplot(x='Target',data=df_original)\n", "\n", "# Here we have a kind of improper data, what ever model we build would \n", "# be dominated by column have strong hold on `NO` on output variable becuase we see that data come from the range where\n", "# most of the people have not opted for Term Deposit" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[,\n", " ,\n", " ,\n", " ],\n", " [,\n", " ,\n", " ,\n", " ],\n", " [,\n", " ,\n", " ,\n", " ],\n", " [,\n", " ,\n", " ,\n", " ],\n", " [,\n", " ,\n", " ,\n", " ]],\n", " dtype=object)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Histogram\n", "df_main.hist(figsize=(15,15))" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEGCAYAAABrQF4qAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3de3Scd33n8fd3ZnS/32VbsuX7LffYuUJDMIGksBgKNOZ025ye7KZsm6Ut20voWWgbWNr07ELpIdttSmhp2hLSQMCAISUXAqQksRzHSXyNfJcly7Lud2lG3/1jxo6iSPbYlvTMjD6vc3Q88zy/mflqMvnMT7/n9/wec3dERCRzhYIuQEREZpeCXkQkwynoRUQynIJeRCTDKehFRDJcJOgCJqusrPSGhoagyxARSSs7duw47e5VU+1LKujN7Hbgy0AY+Kq7/+Wk/TnAPwHXAh3Ane5+JLHvCuDvgGJgHNjo7sPTvVZDQwONjY3JlCUiIglmdnS6fecdujGzMPAgcAewDvi4ma2b1OxuoMvdVwBfAh5IPDYC/DPwCXdfD7wLGLuI30FERC5SMmP01wFN7n7I3UeBR4HNk9psBr6euP04sMnMDHgv8Kq77wJw9w53j81M6SIikoxkgn4RcHzC/ebEtinbuHsU6AEqgFWAm9mTZvaymf3RpZcsIiIXIpkxepti2+R1E6ZrEwHeAWwEBoGnzWyHuz/9lgeb3QPcA7B48eIkShIRkWQl06NvBuon3K8DWqZrkxiXLwE6E9ufc/fT7j4IbAOumfwC7v6Qu29w9w1VVVMeNBYRkYuUTNBvB1aa2VIzywa2AFsntdkK3JW4/VHgGY+vlvYkcIWZ5Se+AG4B9sxM6SIikozzDt24e9TM7iUe2mHga+6+28zuBxrdfSvwMPCImTUR78lvSTy2y8y+SPzLwoFt7v6DWfpdRERkCpZqyxRv2LDBNY9eROTCJI5/bphqn5ZAEBHJcCm3BIJcvH1PH5ty+5pNmskkMp+pRy8ikuEU9CIiGU5BLyKS4RT0IiIZTkEvIpLhFPQiIhlOQS8ikuEU9CIiGU5BLyKS4RT0IiIZTkEvIpLhFPQiIhlOQS8ikuEU9CIiGU5BLyKS4RT0IiIZTkEvIpLhdIWpeUpXoxKZP9SjFxHJcAp6EZEMp6AXEclwCnoRkQynoBcRyXAKehGRDKegFxHJcEkFvZndbmb7zazJzO6bYn+OmX0zsf9FM2tIbG8wsyEzeyXx8/9mtnwRETmf854wZWZh4EHgNqAZ2G5mW919z4RmdwNd7r7CzLYADwB3JvYddPerZrhuERFJUjI9+uuAJnc/5O6jwKPA5kltNgNfT9x+HNhkZjZzZYqIyMVKJugXAccn3G9ObJuyjbtHgR6gIrFvqZntNLPnzOydU72Amd1jZo1m1tje3n5Bv4CIiJxbMkE/Vc/ck2zTCix296uBTwH/ambFb2vo/pC7b3D3DVVVVUmUJCIiyUom6JuB+gn364CW6dqYWQQoATrdfcTdOwDcfQdwEFh1qUWLiEjykgn67cBKM1tqZtnAFmDrpDZbgbsStz8KPOPubmZViYO5mNkyYCVwaGZKFxGRZJx31o27R83sXuBJ
IAx8zd13m9n9QKO7bwUeBh4xsyagk/iXAcAvAfebWRSIAZ9w987Z+EVERGRqSa1H7+7bgG2Ttn12wu1h4GNTPO5bwLcusUYREbkEOjNWRCTDKehFRDKcgl5EJMMp6EVEMpyCXkQkwynoRUQynIJeRCTDKehFRDKcgl5EJMMp6EVEMlxSSyDI/LHv6WNv27Zm0+IAKhGRmaIevYhIhlPQi4hkOAW9iEiGU9CLiGQ4BX0GcndGB8dwn3xpXxGZjzTrJoMM9Y5w7OV2eloHGB2MUlyTz4p3LAy6LBEJmII+Q/i4s//ZEwz1jFBeX0ReSTYtuzvZ+cRBsnIjXP7+pUGXKCIBUdBniDd+doL+00Os+qVFVK8sBaB2bTlNz7fw4r/sw0LGZXc0BFukiARCY/QZYHRwjO2P7qeoKo+qFSVnt2fnRVj77nqWbKzhhUf28sZPTwRYpYgERUGfAXZ+5yBDPaMsu3EBZvaWfRYybv2dK1m4voKfPvQarXs7A6pSRIKioE9zsbEYe398jOU3L6SoKm/KNpHsMO/51DUUVeXxkwd3Mdw/OsdVikiQNEaf4qZaewbeXH+mdV8X0ZEYy29cwGDX8LTPk50X4db/fhXf+9Nf8PO/f51Fl1e8rfcvIplJPfo01/xKO+GsEAvWlZ+3bdWyEjb86iqObG+j7UD3HFQnIqlAQZ/mjr/STu2acrJyk/vj7PL3L6V2bTlHtrcxNhyd5epEJBUo6NNY36lBeloHqL+qMunHWMi46a51REdjHHv51CxWJyKpIqmgN7PbzWy/mTWZ2X1T7M8xs28m9r9oZg2T9i82s34z+4OZKVsAju9qB6D+quoLelz54iIWrC2ndV8X/R3Tj+uLSGY4b9CbWRh4ELgDWAd83MzWTWp2N9Dl7iuALwEPTNr/JeCHl16uTHT8lXaKqvMors2/4McuuaaaSHaYQy+0ak0ckQyXTI/+OqDJ3Q+5+yjwKLB5UpvNwNcTtx8HNlliSoeZfQg4BOyemZIFIDoao2V3B/VXVV/U7JlITpgl11TTezI+/CMimSuZoF8EHJ9wvzmxbco27h4FeoAKMysA/hj480svVSZq299FbHSc+iuTH5+frGZVKdn5EY6/0j6DlYlIqkkm6KfqLk7+W3+6Nn8OfMnd+8/5Amb3mFmjmTW2tyt0knH6cA8A1SvLLvo5QpEQiy6voKd1kN62wZkqTURSTDJB3wzUT7hfB7RM18bMIkAJ0AlcD/yVmR0Bfg/4EzO7d/ILuPtD7r7B3TdUVVVd8C8xH50+3EtRVR45hVmX9Dy1q8uJ5IbVqxfJYMkE/XZgpZktNbNsYAuwdVKbrcBdidsfBZ7xuHe6e4O7NwB/DXzB3b8yQ7XPax1HeqlYWnzJzxPOCrFofQVdzf30nx6agcpEJNWcN+gTY+73Ak8Ce4HH3H23md1vZh9MNHuY+Jh8E/Ap4G1TMGXmREdj9LYNUtlQcv7GSViwrpxQJETLHi14JpKJkjqd0t23AdsmbfvshNvDwMfO8xx/dhH1yRQGEnPfZ6JHD/FFz6pXlHDqjW6WXl9DVo6WQBLJJDozNg31d8SHWCoaZiboAWrXlDMec069oTVwRDKNgj4N9Z8eJr8sh/ySnBl7zsKKXIqq82jd26UTqEQyjP5GT0MDHcMzNmwz0YK15Rx47gQ9LQOULio8b/vzLaEsIqlBPfo0E4uOM9gzQuXSmTkQO1Hl0mIiuWFdhUokwyjo08xA5zA4VM7g+PwZoXCImpWldB7rY3RwbMafX0SCoaBPMwOnEzNuZiHoAWpWleEOp5p6ZuX5RWTuKejTTH/HEJGcMAUVubPy/PmlORTX5HNyvw7KimQKBX2aGewaoaA8d1av91qzuozh3lF6T2r9G5FMoKBPI+7OYPcI+WUzN61yKpVLiwlnhTh5oGtWX0dE5oaCPo2MDkaJjY2TXzq7QR+OhKhaXkLH4V6iI7FZfS0RmX0K+jQy2DUCMOtBD1C7uozxmNOm
M2VF0p6CPo0Mds9d0BdW5lFUlcfJfZ06KCuS5hT0aWSwe4RIbpisvLk5oXnBunKGekZpeb1jTl5PRGaHgj6NDHYNz0lv/owzZ8ru+fHUSx2ISHpQ0KeJszNu5jDoQ+EQtavKOLaj7eyKmSKSfhT0aWJsKEpsdPZn3ExWu6YMB/aqVy+SthT0aeLsgdhZnkM/WW5RNg0ba9j71DGtfyOSphT0aWIup1ZOduUHlzM6GGXf08fn/LVF5NIp6NPEYPcI4ezQnM24mahqWQkLL6vgtR8eITqqE6hE0o2CPk2cORA7m2vcnMtVm5cz1D1C089OBPL6InLxdIWpNDHYPULF4qLAXn/BunKqlpWw63uHWHVLHaHIhfURdDUqkeAo6NPA2FCU6HBszg/ETmRmXPXh5fz4/7zMgeeaZzWg9aUgMrM0dJMG3lz6YHbWoE/W4muqqVlVxsvfbtJiZyJpREGfBuZyjZtzMTM2blnFYNcIu//9aKC1iEjyFPRpYLB7hHBWiOyC4EfaateUU391Fbu+e1C9epE0oaBPA4NdI+QFOONmsg13rmJ0KMqxne1BlyIiSVDQp4G5XuPmfCoWF7Pm1npa93ScHVYSkdSVVNCb2e1mtt/Mmszsvin255jZNxP7XzSzhsT268zslcTPLjP78MyWn/nGRqKMDUVTKugBrv3VlYSyQhx6oVXr1YukuPMGvZmFgQeBO4B1wMfNbN2kZncDXe6+AvgS8EBi++vABne/Crgd+DszC36gOY0MdY8CwR+InSyvOIcl11TTfWKAzmN9QZcjIueQTI/+OqDJ3Q+5+yjwKLB5UpvNwNcTtx8HNpmZufugu0cT23MBdf0u0GDXMDD3i5klo3ZtOfmlORx64SSxsfGgyxGRaSQT9IuAiatZNSe2TdkmEew9QAWAmV1vZruB14BPTAj+s8zsHjNrNLPG9nYd4JtosHuEUNjIKcwKupS3CYWM5TcvYKR/jGMvnwq6HBGZRjJBP9VUj8k982nbuPuL7r4e2Ah82szedtaPuz/k7hvcfUNVVVUSJc0fg92pNeNmspLaAmpWl3Fidwf9p3VxEpFUlEzQNwP1E+7XAS3TtUmMwZcAnRMbuPteYAC47GKLnY9SbcbNVJZurCErN0LT8y34uEbnRFJNMkG/HVhpZkvNLBvYAmyd1GYrcFfi9keBZ9zdE4+JAJjZEmA1cGRGKp8HRgfHGB2IpuT4/ESRnDDLbqil//Qwza+dDrocEZnkvDNg3D1qZvcCTwJh4GvuvtvM7gca3X0r8DDwiJk1Ee/Jb0k8/B3AfWY2BowDv+3uSoIkdbcMAKk342YqlUuL6ThSzLGX2+k40ktFQ3HQJYlIQlJTHd19G7Bt0rbPTrg9DHxsisc9AjxyiTXOW93N/cDUQT/dCo9BMTOW37SAnpOD/ORvd7H5czcRyQ4HXZaIoDNjU1rn8T5CYSO3KDvoUpKSlRth5TsX0nW8n5e+sT/ockQkQUGfwjqO9pJfnouFUnPGzVTK64u47I4G9jx5lDd+qqtRiaQCnaWaotydjiO9lNdf+lWl5nqY57qPr6bjaC8/f/h1yuoKqVxWMmuvpYuUiJyfevQpqv/0EKODUQoqgr3YyMUIRUK8+5NXkVeSzb9/cQc9JweCLklkXlPQp6iOI70AFKZh0EN8LZzb/uBaYqPj/ODzLzLUo1UuRYKioE9RHUf7MIP88vQMeogvZ/z+/3k941HntW1H6D01GHRJIvOSgj5FdRzppWRhAeFIev8nKl9cxPv/53WYGa9+7zAHf9FKdFRXphKZSzoYm6I6jvZSu7os6DJmRFldEVf/ynKO7jhF655O2g50U15fSMWSYuqvHqYgjf9qEUkHCvoUNNw3ykDHMBVLMufs0kh2mOU3LqB6RQlt+7vpONrL6cO97P9JMwUVuSzZUMPaTfWU1V36LCMReSsFfQrqOBo/EFvRUExfho1rF1XlU1SVz/KbFtDfMUReSS4n93Wy76lj7Hny
KHVXVrJwXTnZ+TO/LLOmYsp8paBPQWdm3FQsybygP8NCRlFVPms2LeayOxoY6hnhwHPN7HziIG0Huln9rjpKFxYEXaZIRkjvI30ZquNoHwXlueQWp8fSBzMhrySHKz+4nM2fu5FIdojXf3SEUwe7gy5LJCMo6FNQx5FeypfMz7Hqsroirtq8jJLafN547gRdiYXdROTiaegmAOcaKx4ZGKO7pZ/lNy6Y46pSRzgrzNr3LObVHxxm3zPHufyXGyiszAu6LJG0pR59ijnV1A0ONRkytfJiRbLDrH/vEiI5Yfb8+BgjA2NBlySSthT0KebUgS4sZFQtn72FwNJFTkEW625bTHR0nH3PHCc2phOtRC6Ggj7FtB3opnxxEVm5GlUDKCjPZdUvLaTv1BC/+PreoMsRSUsK+hQyHhvnVFM3Navm97DNZJVLS6i7opJ9zxzn9R8dCbockbSjbmMK6TzWR3QkRs2q0qBLSTlLrq0mnBPmhUf2UlSVx5Jra4IuSSRtqEefQtoOdAGoRz8FCxm3/vaVVC0t4dmv7KJdc+xFkqYefQppO9BNQXmuphJOI5IT5rY/vJbvffYX/PAvtnP7H2+YtdfScgmSSRT0KaTtQBfVGrY5p/ySHN7/mevZ9r9e4od/sZ01m+opqZ27pRKm+gJQ+Euq09BNihjpH2OgY1jDNkkorMzj/Z+5nvyyXHb/6OjZIS8RmZqCPkX0tMWvq1qroE9KQXkuH/jT6ymqyeeNn7XQ9HwLseh40GWJpCQFfYroaRkgOz9CeUPmrEE/2/KKc7jsfUuou6KSk/u62PnEQbpbdSFykck0Rp8iulsHWLCuglDIgi7lbaY7MJkKLGQ0bKyhdFEBTT9v5fVtR6heWcria6rJL9OVq0RAPfqUMNw3ykjfGAvXVwRdStoqXVjI1b+ynLorKmk/2MNjn/opr3znoK5PK0KSQW9mt5vZfjNrMrP7ptifY2bfTOx/0cwaEttvM7MdZvZa4t93z2z5maG7JT7csHB9ecCVpLdwJETDxhqu+cgKFl1RSeNjB3j8D3/G4RdbcfegyxMJzHmHbswsDDwI3AY0A9vNbKu775nQ7G6gy91XmNkW4AHgTuA08J/cvcXMLgOeBBbN9C+R7npaB8jKi1C6qDDoUjJCXnE2t/3+NbTs7uCFR/by9JdfoXZNGbWry3SOgsxLyfTorwOa3P2Qu48CjwKbJ7XZDHw9cftxYJOZmbvvdPeWxPbdQK6Z5cxE4ZnC3eluGaB0QQFmqTc+n84Wrq/gQ1+4mZvvXk/3iX5e+e4hmn7eouEcmXeSCfpFwPEJ95t5e6/8bBt3jwI9wOQB548AO919ZPILmNk9ZtZoZo3t7e3J1p4RhrpHGBuKUqLro86KUMhYu2kxH/viLSxcX8HJ/V288p1DGXstXpGpJBP0U3UzJw94nrONma0nPpzzW1O9gLs/5O4b3H1DVVVVEiVljjPTAXUh7NmVU5DFshtqufz9Dfi4s+v7hzm+q11j9zIvJBP0zUD9hPt1QMt0bcwsApQAnYn7dcATwG+4+8FLLTjTdLcMkFOYRW7R/LkQeJBKagu4+sPLqWwo5mjjKXY/eZTBnrf9kSmSUZIJ+u3ASjNbambZwBZg66Q2W4G7Erc/Cjzj7m5mpcAPgE+7+/MzVXSm8HGnp3VAvfk5FskJs/rWOpbfvIDek4M88ennadndEXRZIrPmvEGfGHO/l/iMmb3AY+6+28zuN7MPJpo9DFSYWRPwKeDMFMx7gRXAZ8zslcRP9Yz/FmlqoHOY2Oi4xucDYGYsWFPOlR9cRnZ+hG1feInGxw7oQK1kpKTOjHX3bcC2Sds+O+H2MPCxKR73eeDzl1hjxjozf750gYI+KAXluXzo8zfxH/+4h1e+c5CDz7ew8eOrcXfNgpKMoSUQAtTdMkB+aQ7Z+VlBlzKvZeVGuOUTV7D85oW8+M/7eOZvXiGvJJua1WVUryglO0//m0h60yc4IOOxcXrbBjJy
WeJUXhvnXOour2ThX9zMwedbePlbb3DkpTaOvNRGYWUuZfVFVDYUU1Cu9XMk/SjoA9LXPsR41CnRsE1KCYWMle9cRGw0xmDXMKeP9NHV3MfxV9o5vrOd/LIcFq4rp2Z1mYZ2JG0o6APSo/F5ILV7//lluSwuy2Xx1VWMDkU5fbiHU2900/R8K20Hull+80IKK9TDl9Sn1SsD0t0yQGFlLpGccNClSBKy8yIsXFfBlR9cxqpfWsRw3yi7th46e0BdJJUp6AMQi47T1z6kYZs0ZGZUryzlmo+sIK84m71PH6OruS/oskTOSUEfgL5Tg/i4xufTWVZuhHXvXUwoHOLJv2pksGs46JJEpqWgD0DPyUEwKK7JD7oUuQS5Rdmsf+9ihnpHeekb+4MuR2RaCvoA9LQOUFCeSyRb4/PprrAyj3W3LeHg8y306Hq1kqIU9HMsNhaLj8/XatgmU1zxgaWEskLsfKIp6FJEpqSgn2PtB3vwmFNSq2GbTJFXkqNevaQ0Bf0ca93XCUCxgj6jqFcvqUwnTM2xk3u7yC/LISv37W99Kp88JOeWV5LD6nfVs//Z49z0m+u1Po6kFPXo59B4bJy2A10atslQy26oJTY2zvGdp4IuReQt1O2YQ6cP9xIdiVGsA7EXJdX/4qlZVUZ+aQ5HXmpj+U0Lgy5H5Cz16OfQycT4vHr0mclCxpKNNRzf1U50RBcwkdShoJ9DrXs7KVlQoPXnM9jS62qJjsQ4vqs96FJEztLQzRwZH3fa9nex9PraoEuZF4Ia5qldU0ZOYRZHXjrJ0uv031pSg3r0c6TreB+jg1Fq15YHXYrMolA4RMPGGo7tPEVsTMM3khoU9HOkdW98fH7BGgV9pmvYWMvYUIyWPZ1BlyICKOjnzMl9nRRW5lFYmRd0KTLLFqwtJxQ2Wvd0BF2KCKCgnxPuzsl9XSzQsM28EMkJU7WilNbd6tFLalDQz4HulgGGe0epXZN5FwKXqS1cV87pwz2MDo4FXYqIZt3MhTPz53UgNjXNxgydBesq2PnEQU7u72Lx1dUz/vwiF0I9+jlwcm8n+aU5utDIPFK9spRwVohWHZCVFKAe/Sxzd1r3dlK7thwzC7ocmQXT/UVQvbKUlt06ICvBS6pHb2a3m9l+M2sys/um2J9jZt9M7H/RzBoS2yvM7Fkz6zezr8xs6emhu2WAwa4RFq6vCLoUmWML1pbTcbSXkX6N00uwztujN7Mw8CBwG9AMbDezre6+Z0Kzu4Eud19hZluAB4A7gWHgM8BliZ95p+X1eI9u0WUK+vlmbCgKDo2P7adiSTEAazYtDrgqmY+S6dFfBzS5+yF3HwUeBTZParMZ+Hri9uPAJjMzdx9w958TD/x5qeX10xRV5VFUrfH5+aaoOo9Q2HTVKQlcMkG/CDg+4X5zYtuUbdw9CvQASXdhzeweM2s0s8b29sxZDGo8Nk7r3k4WXlYZdCkSgFA4RFF1Pj0nB4MuRea5ZIJ+qiOIfhFtpuXuD7n7BnffUFVVlezDUt7pw72MDkZZqGGbeau4Np+BzmGio1r3RoKTTNA3A/UT7tcBLdO1MbMIUALM+3llJ14/DcDC9Zo/P1+V1OaDQ2+bevUSnGSCfjuw0syWmlk2sAXYOqnNVuCuxO2PAs+4e9I9+kzV8noH5UuKyCvOCboUCUhRdT5mCnoJ1nln3bh71MzuBZ4EwsDX3H23md0PNLr7VuBh4BEzayLek99y5vFmdgQoBrLN7EPAeyfN2MlI0ZEYbQe6WP++hqBLkQCFIyEKK/Po1Ti9BCipE6bcfRuwbdK2z064PQx8bJrHNlxCfWmrdV8n41HX/HmhuDaflt2dxKLjQZci85SWQJglRxvbiOSEWbBO4/PzXUltAT7u9LcPBV2KzFMK+lng487RHaeov7KKSHY46HIkYGfWONI0SwmKgn4WnGrqZqh7hCUba4IuRVJAJCdMfnkOvSd14pQEQ4uazYIj29sIhY3FV1cFdpFqSS0l
tQW0HehmPDpOKKL+lcwtBf0MORPo7s4bPz1BcW0Bh37RGnBVkipKFhTQuqeTU03d1Oq6wTLH1LWYYYNdIwz3jVLRUBR0KZJCShcWgEHzq6eDLkXmIQX9DOs42gtAxWIFvbwpkh2muDqf5l2Zs5aTpA8F/Qxyd0419VBck092flbQ5UiKKasr5PThXoZ6RoIuReYZBf0M6tFFwOUcyuoKAWh+TcM3MrcU9DOodV8XkdwwlUuLgy5FUlBBRS65xdk071LQy9xS0M+QkYExOo72UrOylFBYb6u8nZlRd0UlJ15tx8fn/Zp/MoeUSDOk7UAXOBq2kXOqu6KS4b4xTh/pDboUmUcU9DNgPDpO2/5uShcWaEliOadFV1RiBkdeOhl0KTKPKOhnwL5njzMyMKaVKuW88opzqL+6mgPPNTOu1SxljijoL9HoUJSXv9VEcW0+ZfWFQZcjaWDtpnqGekY5uqMt6FJknlDQX6LXfnCY4d5Rll5Xg9lUl84VeatFV1ZRWJnH3qeOB12KzBMK+ksw2DXMaz84zNLraymqyg+6HEkToZCx+t11tOzuoKdVK1rK7FPQXyR35/l/2MN4dJwNd64KuhxJM6tvqcPCxr5n1KuX2aegv0h7f3yMo41tbNyympLagqDLkTSTX5ZLw8Ya9j51TL16mXUK+ovQcaSXF/55L/VXV3HZLzcEXY6kqRt+bQ3hSIhnH9yl68nKrFLQX6CBjiGe+tLL5BZlc8tvXaEDsHLRCiryeMd/vYzTh3p4+fE3gi5HMpiC/gIMdA7zg8+/xHD/GO/5/WvILc4OuiRJc0uvq2X1rXXs+t4hXvnuQcZj6tnLzNMVppLU2zbAjx5oZKh3hDvu20j1itKgS5IMccOvr6XjSC+N3zzA/meOs+yGWgqr8lj7niVBlyYZQkF/DmcuD3j6cC9v/OwEZsa69y6m81gfncf6Aq5OMkVWboQ1766n/WAPB/+jlV3fO0xOYRZdx/spX1JMWV0hZXWFusaBXDQF/TmMDkU52thG24FuCqvyWHNrHblFGq6R2VG1vISyukI6jvVx+nAP+59rJjb65lBOfnkOpQsKiY3FyCnMJrcoi9yibHIKs7jiA8uw0NuPF+17+hjuznjMiY7EiI3GWLKhFoBQxM4+j1ZczWwK+imMDUfZ+9QxdvzbG4xHx1l0eQVLrq3W/wwy6yI5YWpWllKzspTVt9bT1z5EV3Mf3Sf66Wrup/fkIF0n+hkbir7lcS9/q4n80hwiuREiOSFio+OMDkUZ7h0lNhrDJ6yK/PK3D77tdXMKsrCwkZ0XITs/8ZMXYcmGWvLLcsgvyyG3OJus3AjhrJAmIaQZBf0E/R1D7Hv6OHt/fIyRgTHK6gpZen0t+aVakVJmxpnhwGRYyCiuyae4Jp8l19a85Tli0XFG+scY7htlpH+M/PJcBrtGiI7EiI7EiJSEyMqLMNAxTDg7RCQrRDgnTCQrTP3V1QCJ59EJ4gwAAAgvSURBVBhluHeU4b4xTr3RxdhwlIHOYbqao8TGxjnSeGqKwiAcCRGKhAhHjFDidihsZ7eHIvHbWTlhFl5WSU5hVuIn+83bBVmEI+o8zQVzP/8FEMzsduDLQBj4qrv/5aT9OcA/AdcCHcCd7n4kse/TwN1ADPikuz95rtfasGGDNzY2XvhvcpH62gdpfvU0r287TE/rIADlS4qou7yS4hotayDpYc2mxVNuv5AvlsliYzHqrqxmsHuEwa5hhvtGiY7EGBuOceqNbmLRccaj44zHxhmP+pv3E7djY+NER2NwjogJZ4WI5ITP/pTXF539EsgpyianIEJ2fvx+dkGEnPwssguyyM6LTDlUNZ+Z2Q533zDVvvP26M0sDDwI3AY0A9vNbKu775nQ7G6gy91XmNkW4AHgTjNbB2wB1gMLgafMbJW7xy7tV0qOuzMeHWdsKMbYcPTsB7a3bZCOo32cPtRDb1s83HOLs1l8TRXVy0s1bVLSzqUE+nTCWWFa93S8
eT8SIhwJkVOQRWFFbVLP4e4su3EhI/1jjAyMMdI/evZ28672s3+BjCX+7Tzed3a/x87dCc3Oj5BdkEVO4t/sgsQXQn4k8W/8yyGSHU78pZH46yP81r9EwhGLD8saGIDZ1LchcTux7cw+i7ezaba9ZV9Akhm6uQ5ocvdDAGb2KLAZmBj0m4E/S9x+HPiKxX+rzcCj7j4CHDazpsTz/WJmyn/T6UM9fP9zL+Ljzvi4xy/Vdo7PSWFVHhVLiln3viUsurySk3s7NO4oMsPMjMMvtL5teyhkLE4MIU3F3eN/EYzEiI7G/zKIjcSIjp75GaeoOp/RgfiXwuhglN6TA4wMjDHcO5a6a/1P+AKJfwlM/HIwll5fyy2fuGLGXzaZoF8ETFx5qRm4fro27h41sx6gIrH9hUmPXTT5BczsHuCexN1+M9ufVPVzpxLQFZ2np/dnenpvzk3vz0T/CPy3t2y5kPdn2hMvkgn6qbq5k/vK07VJ5rG4+0PAQ0nUEggza5xu7Ev0/pyL3ptz0/tzbjP1/iRzyLsZqJ9wvw5oma6NmUWAEqAzyceKiMgsSibotwMrzWypmWUTP7i6dVKbrcBdidsfBZ7x+HSercAWM8sxs6XASuClmSldRESScd6hm8SY+73Ak8SnV37N3Xeb2f1Ao7tvBR4GHkkcbO0k/mVAot1jxA/cRoHfmasZNzMsZYeVUoTen+npvTk3vT/nNiPvT1Lz6EVEJH3ptDQRkQynoBcRyXAK+gnMrN7MnjWzvWa228x+N7G93Mx+bGZvJP4tC7rWIJlZ2Mx2mtn3E/eXmtmLiffnm4mD9vOSmZWa2eNmti/xObpRn584M/v9xP9Xr5vZN8wsdz5/dszsa2Z2ysxen7Btys+Kxf2NmTWZ2atmds2FvJaC/q2iwP9w97XADcDvJJZxuA942t1XAk8n7s9nvwvsnXD/AeBLifeni/iSGPPVl4Efufsa4Eri79O8//yY2SLgk8AGd7+M+MSOM8ulzNfPzj8Ct0/aNt1n5Q7isxZXEj+59G8v6JXcXT/T/ADfJb7Gz35gQWLbAmB/0LUF+J7UJT6A7wa+T/ykuNNAJLH/RuDJoOsM6L0pBg6TmOQwYfu8//zw5tnz5cRn+30feN98/+wADcDr5/usAH8HfHyqdsn8qEc/DTNrAK4GXgRq3L0VIPHv9It0ZL6/Bv4IOLOYSAXQ7e5nFkifcpmLeWIZ0A78Q2Jo66tmVoA+P7j7CeB/A8eAVqAH2IE+O5NN91mZaimapN8rBf0UzKwQ+Bbwe+7eG3Q9qcLMPgCccvcdEzdP0XS+ztmNANcAf+vuVwMDzMNhmqkkxpo3A0uJr2RbQHw4YrL5+tk5n0v6/0xBP4mZZREP+X9x928nNreZ2YLE/gXAFFdjmBduBj5oZkeAR4kP3/w1UJpY+gLm9zIXzUCzu7+YuP848eDX5wfeAxx293Z3HwO+DdyEPjuTTfdZuaTlZBT0EySWVn4Y2OvuX5ywa+ISD3cRH7ufd9z90+5e5+4NxA+kPePuvwY8S3zpC5jf789J4LiZrU5s2kT8rHB9fuJDNjeYWX7i/7Mz740+O2813WdlK/Abidk3NwA9Z4Z4kqEzYycws3cAPwNe480x6D8hPk7/GLCY+Af2Y+7eGUiRKcLM3gX8gbt/wMyWEe/hlwM7gf/s8WsQzDtmdhXwVSAbOAT8JvEO1bz//JjZnwN3Ep/dthP4L8THmeflZ8fMvgG8i/hSxG3AnwLfYYrPSuLL8SvEZ+kMAr/p7klfik9BLyKS4TR0IyKS4RT0IiIZTkEvIpLhFPQiIhlOQS8ikuEU9CIiGU5BLyKS4RT0IhOY2XfMbEdi3fR7EtvuNrMDZvYTM/t7M/tKYnuVmX3LzLYnfm4OtnqRqemEKZEJzKw8cSZiHrCd+FK6zxNfs6YPeAbY5e73mtm/Av/X3X9uZouJL7G7NrDiRaYROX8TkXnlk2b24cTteuDXgefOLFlgZv8GrErs
fw+wLn52OgDFZlbk7n1zWbDI+SjoRRIS6/e8B7jR3QfN7CfEL/AwXS89lGg7NDcVilwcjdGLvKkE6EqE/Bril5PMB24xs7LEcrofmdD+34F7z9xJLGgmknIU9CJv+hEQMbNXgc8BLwAngC8QX8H0KeJL6/Yk2n8S2JC4WPMe4BNzX7LI+elgrMh5mFmhu/cnevRPAF9z9yeCrkskWerRi5zfn5nZK8DrxC/+/Z2A6xG5IOrRi4hkOPXoRUQynIJeRCTDKehFRDKcgl5EJMMp6EVEMtz/B2rJYHqQNYnpAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sns.distplot(df_main['age'],kde=True)\n", "\n", "# We can conclude that data set has people raning from 20-60 and there are some outliers present as well\n", "# becuase the tail on right is spreading a little more" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['admin.' 'blue-collar' 'entrepreneur' 'housemaid' 'management' 'retired'\n", " 'self-employed' 'services' 'student' 'technician' 'unemployed' 'unknown']\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sns.distplot(df_main['job'])\n", "# Looking at the graph we see that are multiple groups present in here, we have seen this and converted to categorical variable\n", "# lets print it from cached map `col`umnEncoders`\n", "print(columnEncoders['job'].classes_)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "management\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Relation between age, job to Term deposit\n", "sns.barplot('job','age',hue='Target',data=df_main,ci=None)\n", "print(columnEncoders['job'].classes_[4])\n", "\n", "# We can see that management people have opted for term deposit more" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Target no yes\n", "job \n", "admin. 4540 631\n", "blue-collar 9024 708\n", "entrepreneur 1364 123\n", "housemaid 1131 109\n", "management 8157 1301\n", "retired 1748 516\n", "self-employed 1392 187\n", "services 3785 369\n", "student 669 269\n", "technician 6757 840\n", "unemployed 1101 202\n", "unknown 254 34" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Finding relation between job and term deposit\n", "pd.crosstab(df_original['job'], df_original['Target'])\n", "\n", "# We here conclude that management, technician, blue-collar are some of the categories that tend to apply for term deposit.\n", "# This conclusion is based on that fact that their earning is on higher side. A general human assumption." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['primary' 'secondary' 'tertiary' 'unknown']\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Let's analyse, by education level, which categories subscribe to term deposits most often\n", "sns.countplot(x='education', hue='Target',data=df_main)\n", "print(columnEncoders['education'].classes_)\n", "\n", "# Here we conclude that the order of term-deposit subscriptions is secondary > tertiary > primary > unknown" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['divorced' 'married' 'single']\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Let's analyse, by marital status, which categories subscribe to term deposits most often\n", "\n", "sns.countplot(x='marital', hue='Target',data=df_main)\n", "\n", "print(columnEncoders['marital'].classes_)\n", "\n", "# Here we conclude that the order of term-deposit subscriptions is married > single > divorced" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th>month</th><th>apr</th><th>aug</th><th>dec</th><th>feb</th><th>jan</th><th>jul</th><th>jun</th><th>mar</th><th>may</th><th>nov</th><th>oct</th><th>sep</th></tr>\n",
" <tr><th>Target</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>no</th><td>2355</td><td>5559</td><td>114</td><td>2208</td><td>1261</td><td>6268</td><td>4795</td><td>229</td><td>12841</td><td>3567</td><td>415</td><td>310</td></tr>\n",
" <tr><th>yes</th><td>577</td><td>688</td><td>100</td><td>441</td><td>142</td><td>627</td><td>546</td><td>248</td><td>925</td><td>403</td><td>323</td><td>269</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "month apr aug dec feb jan jul jun mar may nov oct sep\n", "Target \n", "no 2355 5559 114 2208 1261 6268 4795 229 12841 3567 415 310\n", "yes 577 688 100 441 142 627 546 248 925 403 323 269" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.crosstab(df_original['Target'], df_original['month'])\n", "\n", "# We see here:\n", "# - May has the highest counts of both accepted (925) and declined (12841) offers\n", "# - August has the second-highest number of acceptances (688)\n", "# In percentage terms, however, August (688 of 6247, about 11%) beats May (925 of 13766, about 7%).\n", "# March, September, October and December have few contacts but acceptance rates above 40%." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Visualise term-deposit outcomes by month\n", "sns.countplot(x='month', hue='Target',data=df_original)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Checking whether the contact type relates to subscribing to a term deposit\n", "sns.countplot(x='contact', hue='Target',data=df_original)\n", "\n", "# This indicates that customers with a registered cellular contact have a slightly higher rate of applying for a term deposit\n", "# This may again point to people in higher job profiles such as management and technicians (an assumption; the plot does not split by job)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>age</th><th>job</th><th>marital</th><th>education</th><th>default</th><th>balance</th><th>housing</th><th>loan</th><th>contact</th><th>day</th><th>month</th><th>duration</th><th>campaign</th><th>pdays</th><th>previous</th><th>poutcome</th><th>Target</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>58</td><td>4</td><td>1</td><td>2</td><td>0</td><td>2143</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>4.35</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>1</th><td>44</td><td>9</td><td>2</td><td>1</td><td>0</td><td>29</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>2.52</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>2</th><td>33</td><td>2</td><td>1</td><td>1</td><td>0</td><td>2</td><td>1</td><td>1</td><td>2</td><td>5</td><td>8</td><td>1.27</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>3</th><td>47</td><td>1</td><td>1</td><td>3</td><td>0</td><td>1506</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>1.53</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>4</th><td>33</td><td>11</td><td>2</td><td>3</td><td>0</td><td>1</td><td>0</td><td>0</td><td>2</td><td>5</td><td>8</td><td>3.30</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age job marital education default balance housing loan contact \\\n", "0 58 4 1 2 0 2143 1 0 2 \n", "1 44 9 2 1 0 29 1 0 2 \n", "2 33 2 1 1 0 2 1 1 2 \n", "3 47 1 1 3 0 1506 1 0 2 \n", "4 33 11 2 3 0 1 0 0 2 \n", "\n", " day month duration campaign pdays previous poutcome Target \n", "0 5 8 4.35 1 -1 0 3 0 \n", "1 5 8 2.52 1 -1 0 3 0 \n", "2 5 8 1.27 1 -1 0 3 0 \n", "3 5 8 1.53 1 -1 0 3 0 \n", "4 5 8 3.30 1 -1 0 3 0 " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "# Convert duration from seconds to minutes, rounded to 2 decimal places\n", "decimal_points = 2\n", "df_main['duration'] = df_main['duration'] / 60\n", "df_main['duration'] = df_main['duration'].apply(lambda x: round(x, decimal_points))\n", "\n", "df_main.head()" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# The balance column has a much larger range than the other columns, so scale it to [0, 1]\n", "from sklearn.preprocessing import MinMaxScaler\n", "\n", "scaler = MinMaxScaler()\n", "\n", "df_main[['balance']] = scaler.fit_transform(df_main[['balance']])" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>age</th><th>job</th><th>marital</th><th>education</th><th>default</th><th>balance</th><th>housing</th><th>loan</th><th>contact</th><th>day</th><th>month</th><th>duration</th><th>campaign</th><th>pdays</th><th>previous</th><th>poutcome</th><th>Target</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>58</td><td>4</td><td>1</td><td>2</td><td>0</td><td>0.092259</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>4.35</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>1</th><td>44</td><td>9</td><td>2</td><td>1</td><td>0</td><td>0.073067</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>2.52</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>2</th><td>33</td><td>2</td><td>1</td><td>1</td><td>0</td><td>0.072822</td><td>1</td><td>1</td><td>2</td><td>5</td><td>8</td><td>1.27</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>3</th><td>47</td><td>1</td><td>1</td><td>3</td><td>0</td><td>0.086476</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>1.53</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" <tr><th>4</th><td>33</td><td>11</td><td>2</td><td>3</td><td>0</td><td>0.072812</td><td>0</td><td>0</td><td>2</td><td>5</td><td>8</td><td>3.30</td><td>1</td><td>-1</td><td>0</td><td>3</td><td>0</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age job marital education default balance housing loan contact \\\n", "0 58 4 1 2 0 0.092259 1 0 2 \n", "1 44 9 2 1 0 0.073067 1 0 2 \n", "2 33 2 1 1 0 0.072822 1 1 2 \n", "3 47 1 1 3 0 0.086476 1 0 2 \n", "4 33 11 2 3 0 0.072812 0 0 2 \n", "\n", " day month duration campaign pdays previous poutcome Target \n", "0 5 8 4.35 1 -1 0 3 0 \n", "1 5 8 2.52 1 -1 0 3 0 \n", "2 5 8 1.27 1 -1 0 3 0 \n", "3 5 8 1.53 1 -1 0 3 0 \n", "4 5 8 3.30 1 -1 0 3 0 " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Some columns are prefixed with `p` (pdays, previous, poutcome). According to the problem\n", "# statement, these fields describe a previous campaign rather than the customer.\n", "# poutcome, for example, need not be part of the training data because it is not an input attribute\n", "# but the outcome of the previous campaign.\n", "\n", "# So we can also build our model without these `p{x}` columns\n", "df_main.head()" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>count</th><th>mean</th><th>std</th><th>min</th><th>25%</th><th>50%</th><th>75%</th><th>max</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>age</th><td>45211.0</td><td>40.936210</td><td>10.618762</td><td>18.0</td><td>33.000000</td><td>39.000000</td><td>48.000000</td><td>95.00</td></tr>\n",
" <tr><th>job</th><td>45211.0</td><td>4.339762</td><td>3.272657</td><td>0.0</td><td>1.000000</td><td>4.000000</td><td>7.000000</td><td>11.00</td></tr>\n",
" <tr><th>marital</th><td>45211.0</td><td>1.167725</td><td>0.608230</td><td>0.0</td><td>1.000000</td><td>1.000000</td><td>2.000000</td><td>2.00</td></tr>\n",
" <tr><th>education</th><td>45211.0</td><td>1.224813</td><td>0.747997</td><td>0.0</td><td>1.000000</td><td>1.000000</td><td>2.000000</td><td>3.00</td></tr>\n",
" <tr><th>default</th><td>45211.0</td><td>0.018027</td><td>0.133049</td><td>0.0</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>1.00</td></tr>\n",
" <tr><th>balance</th><td>45211.0</td><td>0.085171</td><td>0.027643</td><td>0.0</td><td>0.073457</td><td>0.076871</td><td>0.085768</td><td>1.00</td></tr>\n",
" <tr><th>housing</th><td>45211.0</td><td>0.555838</td><td>0.496878</td><td>0.0</td><td>0.000000</td><td>1.000000</td><td>1.000000</td><td>1.00</td></tr>\n",
" <tr><th>loan</th><td>45211.0</td><td>0.160226</td><td>0.366820</td><td>0.0</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>1.00</td></tr>\n",
" <tr><th>contact</th><td>45211.0</td><td>0.640242</td><td>0.897951</td><td>0.0</td><td>0.000000</td><td>0.000000</td><td>2.000000</td><td>2.00</td></tr>\n",
" <tr><th>day</th><td>45211.0</td><td>15.806419</td><td>8.322476</td><td>1.0</td><td>8.000000</td><td>16.000000</td><td>21.000000</td><td>31.00</td></tr>\n",
" <tr><th>month</th><td>45211.0</td><td>5.523014</td><td>3.006911</td><td>0.0</td><td>3.000000</td><td>6.000000</td><td>8.000000</td><td>11.00</td></tr>\n",
" <tr><th>duration</th><td>45211.0</td><td>4.302729</td><td>4.292132</td><td>0.0</td><td>1.720000</td><td>3.000000</td><td>5.320000</td><td>81.97</td></tr>\n",
" <tr><th>campaign</th><td>45211.0</td><td>2.763841</td><td>3.098021</td><td>1.0</td><td>1.000000</td><td>2.000000</td><td>3.000000</td><td>63.00</td></tr>\n",
" <tr><th>pdays</th><td>45211.0</td><td>40.197828</td><td>100.128746</td><td>-1.0</td><td>-1.000000</td><td>-1.000000</td><td>-1.000000</td><td>871.00</td></tr>\n",
" <tr><th>previous</th><td>45211.0</td><td>0.580323</td><td>2.303441</td><td>0.0</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>275.00</td></tr>\n",
" <tr><th>poutcome</th><td>45211.0</td><td>2.559974</td><td>0.989059</td><td>0.0</td><td>3.000000</td><td>3.000000</td><td>3.000000</td><td>3.00</td></tr>\n",
" <tr><th>Target</th><td>45211.0</td><td>0.116985</td><td>0.321406</td><td>0.0</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>1.00</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " count mean std min 25% 50% \\\n", "age 45211.0 40.936210 10.618762 18.0 33.000000 39.000000 \n", "job 45211.0 4.339762 3.272657 0.0 1.000000 4.000000 \n", "marital 45211.0 1.167725 0.608230 0.0 1.000000 1.000000 \n", "education 45211.0 1.224813 0.747997 0.0 1.000000 1.000000 \n", "default 45211.0 0.018027 0.133049 0.0 0.000000 0.000000 \n", "balance 45211.0 0.085171 0.027643 0.0 0.073457 0.076871 \n", "housing 45211.0 0.555838 0.496878 0.0 0.000000 1.000000 \n", "loan 45211.0 0.160226 0.366820 0.0 0.000000 0.000000 \n", "contact 45211.0 0.640242 0.897951 0.0 0.000000 0.000000 \n", "day 45211.0 15.806419 8.322476 1.0 8.000000 16.000000 \n", "month 45211.0 5.523014 3.006911 0.0 3.000000 6.000000 \n", "duration 45211.0 4.302729 4.292132 0.0 1.720000 3.000000 \n", "campaign 45211.0 2.763841 3.098021 1.0 1.000000 2.000000 \n", "pdays 45211.0 40.197828 100.128746 -1.0 -1.000000 -1.000000 \n", "previous 45211.0 0.580323 2.303441 0.0 0.000000 0.000000 \n", "poutcome 45211.0 2.559974 0.989059 0.0 3.000000 3.000000 \n", "Target 45211.0 0.116985 0.321406 0.0 0.000000 0.000000 \n", "\n", " 75% max \n", "age 48.000000 95.00 \n", "job 7.000000 11.00 \n", "marital 2.000000 2.00 \n", "education 2.000000 3.00 \n", "default 0.000000 1.00 \n", "balance 0.085768 1.00 \n", "housing 1.000000 1.00 \n", "loan 0.000000 1.00 \n", "contact 2.000000 2.00 \n", "day 21.000000 31.00 \n", "month 8.000000 11.00 \n", "duration 5.320000 81.97 \n", "campaign 3.000000 63.00 \n", "pdays -1.000000 871.00 \n", "previous 0.000000 275.00 \n", "poutcome 3.000000 3.00 \n", "Target 0.000000 1.00 " ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_main.describe().transpose()" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>age</th><th>job</th><th>marital</th><th>education</th><th>default</th><th>balance</th><th>housing</th><th>loan</th><th>contact</th><th>day</th><th>month</th><th>duration</th><th>campaign</th><th>pdays</th><th>previous</th><th>poutcome</th><th>Target</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>age</th><td>1.000000</td><td>-0.021868</td><td>-0.403240</td><td>-0.106807</td><td>-0.017879</td><td>0.097783</td><td>-0.185513</td><td>-0.015655</td><td>0.026221</td><td>-0.009120</td><td>-0.042357</td><td>-0.004648</td><td>0.004760</td><td>-0.023758</td><td>0.001288</td><td>0.007367</td><td>0.025155</td></tr>\n",
" <tr><th>job</th><td>-0.021868</td><td>1.000000</td><td>0.062045</td><td>0.166707</td><td>-0.006853</td><td>0.018232</td><td>-0.125363</td><td>-0.033004</td><td>-0.082063</td><td>0.022856</td><td>-0.092870</td><td>0.004747</td><td>0.006839</td><td>-0.024455</td><td>-0.000911</td><td>0.011010</td><td>0.040438</td></tr>\n",
" <tr><th>marital</th><td>-0.403240</td><td>0.062045</td><td>1.000000</td><td>0.108576</td><td>-0.007023</td><td>0.002122</td><td>-0.016096</td><td>-0.046893</td><td>-0.039201</td><td>-0.005261</td><td>-0.006991</td><td>0.011849</td><td>-0.008994</td><td>0.019172</td><td>0.014973</td><td>-0.016850</td><td>0.045588</td></tr>\n",
" <tr><th>education</th><td>-0.106807</td><td>0.166707</td><td>0.108576</td><td>1.000000</td><td>-0.010718</td><td>0.064514</td><td>-0.090790</td><td>-0.048574</td><td>-0.110928</td><td>0.022671</td><td>-0.057304</td><td>0.001936</td><td>0.006255</td><td>0.000052</td><td>0.017570</td><td>-0.019361</td><td>0.066241</td></tr>\n",
" <tr><th>default</th><td>-0.017879</td><td>-0.006853</td><td>-0.007023</td><td>-0.010718</td><td>1.000000</td><td>-0.066745</td><td>-0.006025</td><td>0.077234</td><td>0.015404</td><td>0.009424</td><td>0.011486</td><td>-0.010021</td><td>0.016822</td><td>-0.029979</td><td>-0.018329</td><td>0.034898</td><td>-0.022419</td></tr>\n",
" <tr><th>balance</th><td>0.097783</td><td>0.018232</td><td>0.002122</td><td>0.064514</td><td>-0.066745</td><td>1.000000</td><td>-0.068768</td><td>-0.084350</td><td>-0.027273</td><td>0.004503</td><td>0.019777</td><td>0.021564</td><td>-0.014578</td><td>0.003435</td><td>0.016674</td><td>-0.020967</td><td>0.052838</td></tr>\n",
" <tr><th>housing</th><td>-0.185513</td><td>-0.125363</td><td>-0.016096</td><td>-0.090790</td><td>-0.006025</td><td>-0.068768</td><td>1.000000</td><td>0.041323</td><td>0.188123</td><td>-0.027982</td><td>0.271481</td><td>0.005075</td><td>-0.023599</td><td>0.124178</td><td>0.037076</td><td>-0.099971</td><td>-0.139173</td></tr>\n",
" <tr><th>loan</th><td>-0.015655</td><td>-0.033004</td><td>-0.046893</td><td>-0.048574</td><td>0.077234</td><td>-0.084350</td><td>0.041323</td><td>1.000000</td><td>-0.010873</td><td>0.011370</td><td>0.022145</td><td>-0.012408</td><td>0.009980</td><td>-0.022754</td><td>-0.011043</td><td>0.015458</td><td>-0.068185</td></tr>\n",
" <tr><th>contact</th><td>0.026221</td><td>-0.082063</td><td>-0.039201</td><td>-0.110928</td><td>0.015404</td><td>-0.027273</td><td>0.188123</td><td>-0.010873</td><td>1.000000</td><td>-0.027936</td><td>0.361145</td><td>-0.020838</td><td>0.019614</td><td>-0.244816</td><td>-0.147811</td><td>0.272214</td><td>-0.148395</td></tr>\n",
" <tr><th>day</th><td>-0.009120</td><td>0.022856</td><td>-0.005261</td><td>0.022671</td><td>0.009424</td><td>0.004503</td><td>-0.027982</td><td>0.011370</td><td>-0.027936</td><td>1.000000</td><td>-0.006028</td><td>-0.030209</td><td>0.162490</td><td>-0.093044</td><td>-0.051710</td><td>0.083460</td><td>-0.028348</td></tr>\n",
" <tr><th>month</th><td>-0.042357</td><td>-0.092870</td><td>-0.006991</td><td>-0.057304</td><td>0.011486</td><td>0.019777</td><td>0.271481</td><td>0.022145</td><td>0.361145</td><td>-0.006028</td><td>1.000000</td><td>0.006311</td><td>-0.110031</td><td>0.033065</td><td>0.022727</td><td>-0.033038</td><td>-0.024471</td></tr>\n",
" <tr><th>duration</th><td>-0.004648</td><td>0.004747</td><td>0.011849</td><td>0.001936</td><td>-0.010021</td><td>0.021564</td><td>0.005075</td><td>-0.012408</td><td>-0.020838</td><td>-0.030209</td><td>0.006311</td><td>1.000000</td><td>-0.084569</td><td>-0.001569</td><td>0.001205</td><td>0.010926</td><td>0.394521</td></tr>\n",
" <tr><th>campaign</th><td>0.004760</td><td>0.006839</td><td>-0.008994</td><td>0.006255</td><td>0.016822</td><td>-0.014578</td><td>-0.023599</td><td>0.009980</td><td>0.019614</td><td>0.162490</td><td>-0.110031</td><td>-0.084569</td><td>1.000000</td><td>-0.088628</td><td>-0.032855</td><td>0.101588</td><td>-0.073172</td></tr>\n",
" <tr><th>pdays</th><td>-0.023758</td><td>-0.024455</td><td>0.019172</td><td>0.000052</td><td>-0.029979</td><td>0.003435</td><td>0.124178</td><td>-0.022754</td><td>-0.244816</td><td>-0.093044</td><td>0.033065</td><td>-0.001569</td><td>-0.088628</td><td>1.000000</td><td>0.454820</td><td>-0.858362</td><td>0.103621</td></tr>\n",
" <tr><th>previous</th><td>0.001288</td><td>-0.000911</td><td>0.014973</td><td>0.017570</td><td>-0.018329</td><td>0.016674</td><td>0.037076</td><td>-0.011043</td><td>-0.147811</td><td>-0.051710</td><td>0.022727</td><td>0.001205</td><td>-0.032855</td><td>0.454820</td><td>1.000000</td><td>-0.489752</td><td>0.093236</td></tr>\n",
" <tr><th>poutcome</th><td>0.007367</td><td>0.011010</td><td>-0.016850</td><td>-0.019361</td><td>0.034898</td><td>-0.020967</td><td>-0.099971</td><td>0.015458</td><td>0.272214</td><td>0.083460</td><td>-0.033038</td><td>0.010926</td><td>0.101588</td><td>-0.858362</td><td>-0.489752</td><td>1.000000</td><td>-0.077840</td></tr>\n",
" <tr><th>Target</th><td>0.025155</td><td>0.040438</td><td>0.045588</td><td>0.066241</td><td>-0.022419</td><td>0.052838</td><td>-0.139173</td><td>-0.068185</td><td>-0.148395</td><td>-0.028348</td><td>-0.024471</td><td>0.394521</td><td>-0.073172</td><td>0.103621</td><td>0.093236</td><td>-0.077840</td><td>1.000000</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age job marital education default balance \\\n", "age 1.000000 -0.021868 -0.403240 -0.106807 -0.017879 0.097783 \n", "job -0.021868 1.000000 0.062045 0.166707 -0.006853 0.018232 \n", "marital -0.403240 0.062045 1.000000 0.108576 -0.007023 0.002122 \n", "education -0.106807 0.166707 0.108576 1.000000 -0.010718 0.064514 \n", "default -0.017879 -0.006853 -0.007023 -0.010718 1.000000 -0.066745 \n", "balance 0.097783 0.018232 0.002122 0.064514 -0.066745 1.000000 \n", "housing -0.185513 -0.125363 -0.016096 -0.090790 -0.006025 -0.068768 \n", "loan -0.015655 -0.033004 -0.046893 -0.048574 0.077234 -0.084350 \n", "contact 0.026221 -0.082063 -0.039201 -0.110928 0.015404 -0.027273 \n", "day -0.009120 0.022856 -0.005261 0.022671 0.009424 0.004503 \n", "month -0.042357 -0.092870 -0.006991 -0.057304 0.011486 0.019777 \n", "duration -0.004648 0.004747 0.011849 0.001936 -0.010021 0.021564 \n", "campaign 0.004760 0.006839 -0.008994 0.006255 0.016822 -0.014578 \n", "pdays -0.023758 -0.024455 0.019172 0.000052 -0.029979 0.003435 \n", "previous 0.001288 -0.000911 0.014973 0.017570 -0.018329 0.016674 \n", "poutcome 0.007367 0.011010 -0.016850 -0.019361 0.034898 -0.020967 \n", "Target 0.025155 0.040438 0.045588 0.066241 -0.022419 0.052838 \n", "\n", " housing loan contact day month duration \\\n", "age -0.185513 -0.015655 0.026221 -0.009120 -0.042357 -0.004648 \n", "job -0.125363 -0.033004 -0.082063 0.022856 -0.092870 0.004747 \n", "marital -0.016096 -0.046893 -0.039201 -0.005261 -0.006991 0.011849 \n", "education -0.090790 -0.048574 -0.110928 0.022671 -0.057304 0.001936 \n", "default -0.006025 0.077234 0.015404 0.009424 0.011486 -0.010021 \n", "balance -0.068768 -0.084350 -0.027273 0.004503 0.019777 0.021564 \n", "housing 1.000000 0.041323 0.188123 -0.027982 0.271481 0.005075 \n", "loan 0.041323 1.000000 -0.010873 0.011370 0.022145 -0.012408 \n", "contact 0.188123 -0.010873 1.000000 -0.027936 0.361145 -0.020838 \n", "day -0.027982 0.011370 -0.027936 1.000000 
-0.006028 -0.030209 \n", "month 0.271481 0.022145 0.361145 -0.006028 1.000000 0.006311 \n", "duration 0.005075 -0.012408 -0.020838 -0.030209 0.006311 1.000000 \n", "campaign -0.023599 0.009980 0.019614 0.162490 -0.110031 -0.084569 \n", "pdays 0.124178 -0.022754 -0.244816 -0.093044 0.033065 -0.001569 \n", "previous 0.037076 -0.011043 -0.147811 -0.051710 0.022727 0.001205 \n", "poutcome -0.099971 0.015458 0.272214 0.083460 -0.033038 0.010926 \n", "Target -0.139173 -0.068185 -0.148395 -0.028348 -0.024471 0.394521 \n", "\n", " campaign pdays previous poutcome Target \n", "age 0.004760 -0.023758 0.001288 0.007367 0.025155 \n", "job 0.006839 -0.024455 -0.000911 0.011010 0.040438 \n", "marital -0.008994 0.019172 0.014973 -0.016850 0.045588 \n", "education 0.006255 0.000052 0.017570 -0.019361 0.066241 \n", "default 0.016822 -0.029979 -0.018329 0.034898 -0.022419 \n", "balance -0.014578 0.003435 0.016674 -0.020967 0.052838 \n", "housing -0.023599 0.124178 0.037076 -0.099971 -0.139173 \n", "loan 0.009980 -0.022754 -0.011043 0.015458 -0.068185 \n", "contact 0.019614 -0.244816 -0.147811 0.272214 -0.148395 \n", "day 0.162490 -0.093044 -0.051710 0.083460 -0.028348 \n", "month -0.110031 0.033065 0.022727 -0.033038 -0.024471 \n", "duration -0.084569 -0.001569 0.001205 0.010926 0.394521 \n", "campaign 1.000000 -0.088628 -0.032855 0.101588 -0.073172 \n", "pdays -0.088628 1.000000 0.454820 -0.858362 0.103621 \n", "previous -0.032855 0.454820 1.000000 -0.489752 0.093236 \n", "poutcome 0.101588 -0.858362 -0.489752 1.000000 -0.077840 \n", "Target -0.073172 0.103621 0.093236 -0.077840 1.000000 " ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_main.corr()" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "corr = df_main.corr()\n", "plt.figure(figsize = (12,12))\n", "\n", "sns.heatmap(corr, annot=True, linewidths=.5, cmap=\"YlGnBu\")\n", "\n", "# Candidates for removal: poutcome, pdays, previous, campaign (pdays and poutcome are strongly correlated, -0.86)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training constants and general imports" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "\n", "# Training constants and general imports\n", "\n", "from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.ensemble import AdaBoostClassifier\n", "from sklearn.ensemble import BaggingClassifier\n", "from sklearn.ensemble import GradientBoostingClassifier\n", "\n", "\n", "\n", "from sklearn import metrics\n", "from sklearn.metrics import classification_report\n", "\n", "# Use a 70:30 split between training and test sets\n", "test_size = 0.30 \n", "\n", "# Random seed for repeatability of the code\n", "seed = 2 # spirit and opportunity Mars exploration rovers\n", "\n", "\n", "def isqrt(n):\n", "    # Integer square root (floor of sqrt(n)) via Newton's method\n", "    x = n\n", "    y = (x + 1) // 2\n", "    while y < x:\n", "        x = y\n", "        y = (x + n // x) // 2\n", "    return x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Preparation" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>age</th><th>job</th><th>marital</th><th>education</th><th>default</th><th>balance</th><th>housing</th><th>loan</th><th>contact</th><th>day</th><th>month</th><th>campaign</th><th>pdays</th><th>previous</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>58</td><td>4</td><td>1</td><td>2</td><td>0</td><td>0.092259</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>1</td><td>-1</td><td>0</td></tr>\n",
" <tr><th>1</th><td>44</td><td>9</td><td>2</td><td>1</td><td>0</td><td>0.073067</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>1</td><td>-1</td><td>0</td></tr>\n",
" <tr><th>2</th><td>33</td><td>2</td><td>1</td><td>1</td><td>0</td><td>0.072822</td><td>1</td><td>1</td><td>2</td><td>5</td><td>8</td><td>1</td><td>-1</td><td>0</td></tr>\n",
" <tr><th>3</th><td>47</td><td>1</td><td>1</td><td>3</td><td>0</td><td>0.086476</td><td>1</td><td>0</td><td>2</td><td>5</td><td>8</td><td>1</td><td>-1</td><td>0</td></tr>\n",
" <tr><th>4</th><td>33</td><td>11</td><td>2</td><td>3</td><td>0</td><td>0.072812</td><td>0</td><td>0</td><td>2</td><td>5</td><td>8</td><td>1</td><td>-1</td><td>0</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " age job marital education default balance housing loan contact \\\n", "0 58 4 1 2 0 0.092259 1 0 2 \n", "1 44 9 2 1 0 0.073067 1 0 2 \n", "2 33 2 1 1 0 0.072822 1 1 2 \n", "3 47 1 1 3 0 0.086476 1 0 2 \n", "4 33 11 2 3 0 0.072812 0 0 2 \n", "\n", " day month campaign pdays previous \n", "0 5 8 1 -1 0 \n", "1 5 8 1 -1 0 \n", "2 5 8 1 -1 0 \n", "3 5 8 1 -1 0 \n", "4 5 8 1 -1 0 " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## Prepare input columns\n", "df_main_x = df_main.copy()\n", "\n", "# Colums we are dropping which are mostly related to previous campaign\n", "df_main_x = df_main_x.drop(['poutcome', 'duration', 'Target'], axis = 1) \n", "\n", "# df_main_x_ary = np.asarray(df_main_x)\n", "\n", "df_main_y = df_original['Target']\n", "\n", "\n", "df_main_x.head()\n" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 no\n", "1 no\n", "2 no\n", "3 no\n", "4 no\n", "Name: Target, dtype: object" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## Target column seperate\n", "df_main_y.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "\n", "from sklearn.model_selection import train_test_split\n", "\n", "X_train, X_test, y_train, y_test = \\\n", " train_test_split(np.asarray(df_main_x), np.asarray(df_main_y), test_size=test_size, random_state=seed) \n" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "# Holder class for data from different classifiers\n", "class EnsembleTechnique:\n", " def __init__(self, score, prediction, accuracy, confusion_matrix, classification_report, n_estimators):\n", " self.score = score\n", " self.prediction = prediction\n", " self.accuracy = accuracy\n", " self.confusion_matrix = confusion_matrix\n", " 
self.classification_report = classification_report\n", " self.n_estimators = n_estimators" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total rows 45211\n", "Limit till we find Nestimators 212\n" ] } ], "source": [ "rows = df_main_x.shape[0]\n", "print(\"Total rows {}\".format(rows))\n", "maxLimit = isqrt(rows)\n", "print(\"Limit till we find Nestimators {}\".format(maxLimit))\n", "\n", "# Result map to hold name and score for each model\n", "results = {}\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Decision Tree using `entropy model`" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Prediction: ['no' 'no' 'no' ... 'no' 'no' 'no']\n", "Score: 0.8261574756708936\n", "Accuracy 0.8261574756708936\n", "Confusion metrix\n", "[[10738 1261]\n", " [ 1097 468]]\n", " precision recall f1-score support\n", "\n", " no 0.91 0.89 0.90 11999\n", " yes 0.27 0.30 0.28 1565\n", "\n", " accuracy 0.83 13564\n", " macro avg 0.59 0.60 0.59 13564\n", "weighted avg 0.83 0.83 0.83 13564\n", "\n" ] } ], "source": [ "#Init\n", "decisionTreeClassifier = DecisionTreeClassifier(criterion = 'entropy')\n", "\n", "#fit data\n", "decisionTreeClassifier.fit(X_train, y_train)\n", "\n", "#Predict\n", "dtc_y_pred = decisionTreeClassifier.predict(X_test)\n", "\n", "# Model score\n", "dtc_model_score = decisionTreeClassifier.score(X_test , y_test)\n", "\n", "# Accuracy\n", "dtc_model_accuracy = metrics.accuracy_score(y_test, dtc_y_pred)\n", "\n", "\n", "print(\"Prediction: {}\".format(dtc_y_pred))\n", "print(\"Score: {}\".format(dtc_model_score))\n", "print(\"Accuracy {}\".format(dtc_model_accuracy))\n", "print(\"Confusion metrix\")\n", "print(metrics.confusion_matrix(y_test, dtc_y_pred))\n", "print(classification_report(y_test,dtc_y_pred))\n", "\n", "results['Decision Tree'] = dtc_model_score" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "## Random Forest Classifier" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Prediction: ['no' 'no' 'no' ... 'no' 'no' 'no']\n", "Score: 0.8882335594219994\n", "Accuracy 0.8882335594219994\n", "n estimators : 128\n", "Confusion metrix\n", "[[11775 224]\n", " [ 1292 273]]\n", " precision recall f1-score support\n", "\n", " no 0.90 0.98 0.94 11999\n", " yes 0.55 0.17 0.26 1565\n", "\n", " accuracy 0.89 13564\n", " macro avg 0.73 0.58 0.60 13564\n", "weighted avg 0.86 0.89 0.86 13564\n", "\n" ] } ], "source": [ "\n", "# determining n_estimators here, we should proceed with approach 2^n\n", "# before stopping at the best outcome we should compare the result of previous outcome\n", "previous = EnsembleTechnique(0.0, 0.0, 0.0, None, None, 0)\n", "\n", "counter = 1;\n", "estimator = 0\n", "while(estimator previous.score:\n", " previous = EnsembleTechnique(rfc_model_score, rfc_y_pred, rfc_model_accuracy,\n", " metrics.confusion_matrix(y_test, rfc_y_pred),\n", " classification_report(y_test,rfc_y_pred), \n", " estimator)\n", " \n", " \n", " \n", " \n", "\n", "print(\"Prediction: {}\".format(previous.prediction))\n", "print(\"Score: {}\".format(previous.score))\n", "print(\"Accuracy {}\".format(previous.accuracy))\n", "print(\"n estimators : {}\".format(previous.n_estimators))\n", "print(\"Confusion metrix\")\n", "print(previous.confusion_matrix)\n", "print(previous.classification_report)\n", "\n", "results['Random Forest'] = previous.score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adaboost Classifier" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Prediction: ['no' 'no' 'no' ... 
'no' 'no' 'no']\n", "Score: 0.8882335594219994\n", "Accuracy 0.8882335594219994\n", "n estimators : 256\n", "Confusion matrix\n", "[[11847 152]\n", " [ 1364 201]]\n", " precision recall f1-score support\n", "\n", " no 0.90 0.99 0.94 11999\n", " yes 0.57 0.13 0.21 1565\n", "\n", " accuracy 0.89 13564\n", " macro avg 0.73 0.56 0.57 13564\n", "weighted avg 0.86 0.89 0.86 13564\n", "\n" ] } ], "source": [ "\n", "previous = EnsembleTechnique(0.0, 0.0, 0.0, None, None, 0)\n", "\n", "counter = 1\n", "estimator = 0\n", "# NOTE: the loop body below, up to the score comparison, was lost in extraction and is\n", "# reconstructed; the classifier variable name and constructor arguments are assumptions.\n", "while estimator < maxLimit:\n", "    estimator = 2 ** counter\n", "    counter += 1\n", "    adaBoostClassifier = AdaBoostClassifier(n_estimators=estimator, random_state=seed)\n", "    adaBoostClassifier.fit(X_train, y_train)\n", "    abc_y_pred = adaBoostClassifier.predict(X_test)\n", "    abc_model_score = adaBoostClassifier.score(X_test, y_test)\n", "    abc_model_accuracy = metrics.accuracy_score(y_test, abc_y_pred)\n", "    # Keep the best-scoring run seen so far\n", "    if abc_model_score > previous.score:\n", "        previous = EnsembleTechnique(abc_model_score, abc_y_pred, abc_model_accuracy,\n", "                                     metrics.confusion_matrix(y_test, abc_y_pred),\n", "                                     classification_report(y_test, abc_y_pred),\n", "                                     estimator)\n", "\n", "print(\"Prediction: {}\".format(previous.prediction))\n", "print(\"Score: {}\".format(previous.score))\n", "print(\"Accuracy {}\".format(previous.accuracy))\n", "print(\"n estimators : {}\".format(previous.n_estimators))\n", "print(\"Confusion matrix\")\n", "print(previous.confusion_matrix)\n", "print(previous.classification_report)\n", "\n", "results['Adaboost Classifier'] = previous.score" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Bagging Classifier" ] },
{ "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.8773223237982896 for n estimator 2\n", "0.8773960483633146 for n estimator 4\n", "0.8824093187850192 for n estimator 8\n", "0.8850634031259216 for n estimator 16\n", "0.8866116189914479 for n estimator 32\n", "0.8859480979062223 for n estimator 64\n", "0.8872014155116484 for n estimator 128\n", "0.8858006487761723 for n estimator 256\n", "Prediction: ['no' 'no' 'no' ... 
'no' 'no' 'no']\n", "Score: 0.8872014155116484\n", "Accuracy 0.8872014155116484\n", "n estimators : 128\n", "Confusion matrix\n", "[[11684 315]\n", " [ 1215 350]]\n", " precision recall f1-score support\n", "\n", " no 0.91 0.97 0.94 11999\n", " yes 0.53 0.22 0.31 1565\n", "\n", " accuracy 0.89 13564\n", " macro avg 0.72 0.60 0.63 13564\n", "weighted avg 0.86 0.89 0.87 13564\n", "\n" ] } ], "source": [ "\n", "\n", "previous = EnsembleTechnique(0.0, 0.0, 0.0, None, None, 0)\n", "\n", "counter = 1\n", "estimator = 0\n", "# NOTE: the loop body below, up to the score comparison, was lost in extraction and is\n", "# reconstructed; the classifier variable name and constructor arguments are assumptions.\n", "while estimator < maxLimit:\n", "    estimator = 2 ** counter\n", "    counter += 1\n", "    baggingClassifier = BaggingClassifier(n_estimators=estimator, random_state=seed)\n", "    baggingClassifier.fit(X_train, y_train)\n", "    bc_y_pred = baggingClassifier.predict(X_test)\n", "    bc_model_score = baggingClassifier.score(X_test, y_test)\n", "    bc_model_accuracy = metrics.accuracy_score(y_test, bc_y_pred)\n", "    print(\"{} for n estimator {}\".format(bc_model_score, estimator))\n", "    # Keep the best-scoring run seen so far\n", "    if bc_model_score > previous.score:\n", "        previous = EnsembleTechnique(bc_model_score, bc_y_pred, bc_model_accuracy,\n", "                                     metrics.confusion_matrix(y_test, bc_y_pred),\n", "                                     classification_report(y_test, bc_y_pred),\n", "                                     estimator)\n", "\n", "results['Bagging Classifier'] = previous.score\n", "print(\"Prediction: {}\".format(previous.prediction))\n", "print(\"Score: {}\".format(previous.score))\n", "print(\"Accuracy {}\".format(previous.accuracy))\n", "print(\"n estimators : {}\".format(previous.n_estimators))\n", "print(\"Confusion matrix\")\n", "print(previous.confusion_matrix)\n", "print(previous.classification_report)\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Gradient Boost Classifier" ] },
{ "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/ashish/installed_apps/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py:1272: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.\n", " _warn_prf(average, modifier, msg_start, len(result))\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Prediction: ['no' 'no' 'no' ... 
'no' 'no' 'no']\n", "Score: 0.8904452963727514\n", "Accuracy 0.8904452963727514\n", "n estimators : 256\n", "Confusion matrix\n", "[[11804 195]\n", " [ 1291 274]]\n", " precision recall f1-score support\n", "\n", " no 0.90 0.98 0.94 11999\n", " yes 0.58 0.18 0.27 1565\n", "\n", " accuracy 0.89 13564\n", " macro avg 0.74 0.58 0.61 13564\n", "weighted avg 0.86 0.89 0.86 13564\n", "\n" ] } ], "source": [ "\n", "previous = EnsembleTechnique(0.0, 0.0, 0.0, None, None, 0)\n", "\n", "counter = 1\n", "estimator = 0\n", "# NOTE: the loop body below, up to the score comparison, was lost in extraction and is\n", "# reconstructed; the classifier variable name and constructor arguments are assumptions.\n", "while estimator < maxLimit:\n", "    estimator = 2 ** counter\n", "    counter += 1\n", "    gradientBoostingClassifier = GradientBoostingClassifier(n_estimators=estimator, random_state=seed)\n", "    gradientBoostingClassifier.fit(X_train, y_train)\n", "    gb_y_pred = gradientBoostingClassifier.predict(X_test)\n", "    gb_model_score = gradientBoostingClassifier.score(X_test, y_test)\n", "    gb_model_accuracy = metrics.accuracy_score(y_test, gb_y_pred)\n", "    # Keep the best-scoring run seen so far\n", "    if gb_model_score > previous.score:\n", "        previous = EnsembleTechnique(gb_model_score, gb_y_pred, gb_model_accuracy,\n", "                                     metrics.confusion_matrix(y_test, gb_y_pred),\n", "                                     classification_report(y_test, gb_y_pred),\n", "                                     estimator)\n", "\n", "\n", "results['Gradient Boost Classifier'] = previous.score\n", "print(\"Prediction: {}\".format(previous.prediction))\n", "print(\"Score: {}\".format(previous.score))\n", "print(\"Accuracy {}\".format(previous.accuracy))\n", "print(\"n estimators : {}\".format(previous.n_estimators))\n", "print(\"Confusion matrix\")\n", "print(previous.confusion_matrix)\n", "print(previous.classification_report)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Analysis Result" ] },
{ "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model score are \n", "{'Decision Tree': 0.8261574756708936, 'Random Forest': 0.8882335594219994, 'Adaboost Classifier': 0.8882335594219994, 'Bagging Classifier': 0.8872014155116484, 'Gradient Boost Classifier': 0.8904452963727514}\n" ] }, { "data": { "text/markdown": [ "**Gradient Boost Classifier** : has best score with accuracy **0.8904452963727514** " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "\n", "print(\"Model score are \")\n", "print(results)\n", "\n", "# Name of the model with the highest score\n", "best_score = max(results, key=results.get)\n", "\n", "resultString = \" has best score with accuracy 
**{}** \".format(results[best_score])\n", "\n", "printTextAsMarkdown(best_score, resultString, color=\"blue\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Analysis Report" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Recall: of all the rows whose actual label is \"Yes\", the fraction that our model detects as \"Yes\".\n", "\n", "Precision: of all the rows our model predicts as \"Yes\", the fraction whose actual label really is \"Yes\".\n", "\n", "A single decision tree does not yield the best result here: it is fit once on all the attributes and tends to overfit, whereas Random Forest trains many trees on random subsets of rows and columns and aggregates their votes. **Random Forest** is therefore usually more accurate than a single tree, though it is not guaranteed to be the best model. **Gradient Boost Classifier** builds its trees incrementally, with each new tree fit to correct the errors of the trees before it. In terms of training speed, Random Forest beats Gradient Boost Classifier because its trees can be built in parallel, whereas Gradient Boost works sequentially.\n", "\n", "From the analysis, Gradient Boost Classifier gives the best model score (0.8904), narrowly ahead of Random Forest and Adaboost (0.8882 each). The classes are imbalanced (11999 \"no\" vs 1565 \"yes\" in the test set), so a model that always predicts \"no\" would already score about 0.885; recall on \"yes\" is only 0.13 to 0.30 across the models, and accuracy alone overstates how well they find subscribers. We have also seen that the number of trees should be in a certain range: too few or too many does not improve the result." ] } ],
"metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 2 }