{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "XKeucFccRlfL"
   },
   "source": [
    "# Acknowledgment\n",
    "\n",
    "This notebook reuses part of a notebook by [Sebastian Raschka](https://sebastianraschka.com), Copyright (c) 2015, 2016\n",
    "\n",
    "Python Machine Learning - Code Examples\n",
    "\n",
    "https://github.com/rasbt/python-machine-learning-book\n",
    "\n",
    "[MIT License](https://github.com/rasbt/python-machine-learning-book/blob/master/LICENSE.txt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "0JNmXbVgRlfO"
   },
   "source": [
    "# A brief overview of classification methods"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "dI8XGlkWRlfP"
   },
   "source": [
    "### Table of contents"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "5ZJv1FEaRlfS"
   },
   "source": [
    "- [Setup and loading the Iris dataset](#Initialisation-et-Chargement-des-Iris)\n",
    "- [The Perceptron](#Le-Perceptron)\n",
    "- [K-nearest neighbors](#K-plus-proches-voisins)\n",
    "- [Logistic regression](#Régression-logistique)\n",
    "- [Support vector machines](#Support-vector-machines)\n",
    "- [The kernel trick](#Le-Kernel-Trick)\n",
    "- [Assignment](#Travail-à-réaliser)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "MProF21-RlfV"
   },
   "source": [
    "# Setup and loading the Iris dataset <a id=\"Initialisation-et-Chargement-des-Iris\"></a>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "Gv1Eg5zqRlfX"
   },
   "outputs": [],
   "source": [
    "from IPython.display import Image\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 519,
     "status": "ok",
     "timestamp": 1537540797380,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "N_XTIISHRlfd",
    "outputId": "9b639aea-4f9c-4cc7-f323-31e691ec5165"
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "print(sys.version_info)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "5L0kwLHkRlfn"
   },
   "outputs": [],
   "source": [
    "# Report the installed scikit-learn version; sklearn.model_selection, used below, requires 0.18 or newer\n",
    "from sklearn import __version__ as sklearn_version\n",
    "print('scikit-learn version:', sklearn_version)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "OM78Y1V6Rlfr"
   },
   "outputs": [],
   "source": [
    "import sklearn\n",
    "from sklearn import datasets\n",
    "iris = datasets.load_iris()\n",
    "iris\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "6SH9OUzURlfy"
   },
   "source": [
    "Loading the Iris dataset from scikit-learn. Here, the third column represents the petal length, and the fourth column the petal width of the flower samples. The classes are already converted to integer labels where 0=Iris-Setosa, 1=Iris-Versicolor, 2=Iris-Virginica."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 552,
     "status": "ok",
     "timestamp": 1537540831772,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "oPmvn2LORlf0",
    "outputId": "c5e59622-0818-49e2-e262-29c81c7e73a3"
   },
   "outputs": [],
   "source": [
    "from sklearn import datasets\n",
    "import numpy as np\n",
    "\n",
    "iris = datasets.load_iris()\n",
    "X = iris.data[:, [2, 3]]\n",
    "y = iris.target\n",
    "\n",
    "print('Class labels:', np.unique(y))"
   ]
  },
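  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The column and label conventions described above can be checked directly on the loaded dataset. The short sketch below relies only on the `feature_names` and `target_names` attributes of the object returned by `load_iris`; the names `iris_check`, `selected_features` and `label_map` are introduced here purely for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn import datasets\n",
    "\n",
    "# Reload the dataset under a separate name so that this check is self-contained\n",
    "iris_check = datasets.load_iris()\n",
    "\n",
    "# Columns 2 and 3 are the ones selected for X in the cell above\n",
    "selected_features = [iris_check.feature_names[i] for i in (2, 3)]\n",
    "print('Selected features:', selected_features)\n",
    "\n",
    "# Map each integer label to its species name\n",
    "label_map = dict(enumerate(iris_check.target_names))\n",
    "print('Label mapping:', label_map)"
   ]
  },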
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "g9QF7oESRlf9"
   },
   "source": [
    "Splitting data into 70% training and 30% test data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "q1v-SYIGRlf-"
   },
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    X, y, test_size=0.3, random_state=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "Ij8D-Rh8RlgD"
   },
   "source": [
    "Standardizing the features:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "a-6u4S5iRlgH"
   },
   "outputs": [],
   "source": [
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "sc = StandardScaler()\n",
    "sc.fit(X_train)\n",
    "X_train_std = sc.transform(X_train)\n",
    "X_test_std = sc.transform(X_test)"
   ]
  },
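  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Standardization rescales each feature to zero mean and unit variance, using statistics estimated on the training set only and then reused for the test set. As a minimal sketch of what `StandardScaler` computes, the cell below compares it with the manual formula on a small made-up array; `demo_train`, `demo_test` and `demo_sc` are illustrative names, not part of the original notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "# Toy data, made up for illustration only\n",
    "demo_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])\n",
    "demo_test = np.array([[2.0, 40.0]])\n",
    "\n",
    "demo_sc = StandardScaler().fit(demo_train)\n",
    "\n",
    "# Manual equivalent: subtract the training mean, divide by the training standard deviation\n",
    "train_mean = demo_train.mean(axis=0)\n",
    "train_std = demo_train.std(axis=0)\n",
    "manual_train = (demo_train - train_mean) / train_std\n",
    "manual_test = (demo_test - train_mean) / train_std\n",
    "\n",
    "print(np.allclose(demo_sc.transform(demo_train), manual_train))\n",
    "print(np.allclose(demo_sc.transform(demo_test), manual_test))"
   ]
  },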
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "_TDFa43KRlgP"
   },
   "source": [
    "# The Perceptron <a id=\"Le-Perceptron\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "CouU51obRlgQ"
   },
   "source": [
    "Training a perceptron model on the standardized training data and evaluating it on the test set:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 69
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 536,
     "status": "ok",
     "timestamp": 1537540837036,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "c6KxTRtRRlgR",
    "outputId": "bb8b7d1d-75c9-4b64-e1cb-76216a4322f4"
   },
   "outputs": [],
   "source": [
    "from sklearn.linear_model import Perceptron\n",
    "\n",
    "ppn = Perceptron(max_iter=40, eta0=0.1, random_state=0)\n",
    "ppn.fit(X_train_std, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 536,
     "status": "ok",
     "timestamp": 1537540838947,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "sw-BEh6YRlgY",
    "outputId": "00c8861e-ce31-4176-db47-3659d9f64c5c"
   },
   "outputs": [],
   "source": [
    "y_test.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 536,
     "status": "ok",
     "timestamp": 1537540840419,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "Pg019giQRlgd",
    "outputId": "3dc0bb2b-5e50-4a43-c5b3-e5f3d3bcbfac"
   },
   "outputs": [],
   "source": [
    "y_pred = ppn.predict(X_test_std)\n",
    "print('Misclassified samples: %d' % (y_test != y_pred).sum())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 52
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 499,
     "status": "ok",
     "timestamp": 1537540848068,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "5f9etrh-Rlgk",
    "outputId": "95aeb6fd-8065-4c99-eab9-9cf743c078a2"
   },
   "outputs": [],
   "source": [
    "from sklearn.metrics import accuracy_score\n",
    "\n",
    "print('Accuracy: %.2f' % accuracy_score(y_test, y_pred))\n",
    "\n",
    "# The estimator's own score method reports the same test accuracy\n",
    "print(ppn.score(X_test_std, y_test))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "MWXUSidlRlgq"
   },
   "outputs": [],
   "source": [
    "from matplotlib.colors import ListedColormap\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def plot_decision_regions(X, y, classifier, test_idx=None, resolution=0.02):\n",
    "    \"\"\"Plot the decision regions of a fitted classifier over two features.\"\"\"\n",
    "\n",
    "    # set up marker generator and color map\n",
    "    markers = ('s', 'x', 'o', '^', 'v')\n",
    "    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')\n",
    "    cmap = ListedColormap(colors[:len(np.unique(y))])\n",
    "\n",
    "    # predict the class of every point on a grid covering the data\n",
    "    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1\n",
    "    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1\n",
    "    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),\n",
    "                           np.arange(x2_min, x2_max, resolution))\n",
    "    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)\n",
    "    Z = Z.reshape(xx1.shape)\n",
    "    plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)\n",
    "    plt.xlim(xx1.min(), xx1.max())\n",
    "    plt.ylim(xx2.min(), xx2.max())\n",
    "\n",
    "    # plot the samples of each class with their own marker and color\n",
    "    for idx, cl in enumerate(np.unique(y)):\n",
    "        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1],\n",
    "                    alpha=0.8, color=colors[idx],\n",
    "                    marker=markers[idx], label=cl)\n",
    "\n",
    "    # highlight test samples with hollow black circles\n",
    "    if test_idx is not None:\n",
    "        X_test = X[list(test_idx), :]\n",
    "        plt.scatter(X_test[:, 0], X_test[:, 1],\n",
    "                    facecolors='none', edgecolors='black',\n",
    "                    alpha=1.0, linewidths=1, marker='o',\n",
    "                    s=55, label='test set')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "WACiLdTRRlgv"
   },
   "source": [
    "Plotting the decision regions of the perceptron, with the test samples highlighted:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 297
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 736,
     "status": "ok",
     "timestamp": 1537540854876,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "bBZ6mYEfRlgx",
    "outputId": "d0568a66-96f7-488c-ebe0-153464b1af22"
   },
   "outputs": [],
   "source": [
    "X_combined_std = np.vstack((X_train_std, X_test_std))\n",
    "y_combined = np.hstack((y_train, y_test))\n",
    "\n",
    "plot_decision_regions(X=X_combined_std, y=y_combined,\n",
    "                      classifier=ppn, test_idx=range(105, 150))\n",
    "plt.xlabel('petal length [standardized]')\n",
    "plt.ylabel('petal width [standardized]')\n",
    "plt.legend(loc='upper left')\n",
    "\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/iris_perceptron_scikit.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "p29LV0lXRlg4"
   },
   "source": [
    "# K-nearest neighbors <a id=\"K-plus-proches-voisins\"></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 314
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 767,
     "status": "ok",
     "timestamp": 1537540866091,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "wN-eI-lzRlg6",
    "outputId": "9ffb363b-3cc1-427f-86ec-6f013d8f9964"
   },
   "outputs": [],
   "source": [
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "knn = KNeighborsClassifier(n_neighbors=1, p=2, metric='minkowski')\n",
    "knn.fit(X_train_std, y_train)\n",
    "\n",
    "print(knn.score(X_test_std, y_test))\n",
    "plot_decision_regions(X_combined_std, y_combined, \n",
    "                      classifier=knn, test_idx=range(105, 150))\n",
    "\n",
    "plt.xlabel('petal length [standardized]')\n",
    "plt.ylabel('petal width [standardized]')\n",
    "plt.legend(loc='upper left')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/k_nearest_neighbors.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "4e_ZiPPzRlhN"
   },
   "source": [
    "# Logistic regression <a id=\"Régression-logistique\"></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 314
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 571,
     "status": "ok",
     "timestamp": 1537540869780,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "CxpnRLY5RlhO",
    "outputId": "cd0178c7-737a-4170-b07b-1559ad0b7d93"
   },
   "outputs": [],
   "source": [
    "from sklearn.linear_model import LogisticRegression\n",
    "\n",
    "lr = LogisticRegression(C=1000.0, random_state=0)\n",
    "lr.fit(X_train_std, y_train)\n",
    "\n",
    "\n",
    "print('Accuracy: %.2f' % lr.score(X_test_std, y_test))\n",
    "\n",
    "\n",
    "plot_decision_regions(X_combined_std, y_combined,\n",
    "                      classifier=lr, test_idx=range(105, 150))\n",
    "plt.xlabel('petal length [standardized]')\n",
    "plt.ylabel('petal width [standardized]')\n",
    "plt.legend(loc='upper left')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/logistic_regression.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 548,
     "status": "ok",
     "timestamp": 1537540872646,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "Dw6kaXrgRlhX",
    "outputId": "51935b9f-75dc-4daf-b92b-2e4e7c5ad4b3"
   },
   "outputs": [],
   "source": [
    "print(lr.predict_proba(X_test_std[0, :].reshape(1, -1)))\n",
    "\n",
    "#print(lr.coef_)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "I_31t6EIRlhb"
   },
   "source": [
    "# Support vector machines"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "iJl7Xc6JRlhc"
   },
   "source": [
    "## The non-linearly separable case and slack variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 314
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 801,
     "status": "ok",
     "timestamp": 1537540885995,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "sfhxC2EzRlhd",
    "outputId": "c4ed3a68-eaa7-49b4-ea94-d4f5d4df3c8a"
   },
   "outputs": [],
   "source": [
    "from sklearn.svm import SVC\n",
    "\n",
    "svm = SVC(kernel='linear', C=1.0, random_state=0)\n",
    "svm.fit(X_train_std, y_train)\n",
    "\n",
    "\n",
    "print('Accuracy: %.2f' % svm.score(X_test_std, y_test))\n",
    "\n",
    "\n",
    "plot_decision_regions(X_combined_std, y_combined,\n",
    "                      classifier=svm, test_idx=range(105, 150))\n",
    "plt.xlabel('petal length [standardized]')\n",
    "plt.ylabel('petal width [standardized]')\n",
    "plt.legend(loc='upper left')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/support_vector_machine_linear.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "noJLgNHmRlhg"
   },
   "source": [
    "## The kernel trick <a id=\"Le-Kernel-Trick\"></a>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 297
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 721,
     "status": "ok",
     "timestamp": 1537540889890,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "KkjMzQ3bRlhh",
    "outputId": "9a32e167-52a2-46f0-eac8-2435988fa12a"
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(0)\n",
    "X_xor = np.random.randn(200, 2)\n",
    "y_xor = np.logical_xor(X_xor[:, 0] > 0,\n",
    "                       X_xor[:, 1] > 0)\n",
    "y_xor = np.where(y_xor, 1, -1)\n",
    "\n",
    "plt.scatter(X_xor[y_xor == 1, 0],\n",
    "            X_xor[y_xor == 1, 1],\n",
    "            c='b', marker='x',\n",
    "            label='1')\n",
    "plt.scatter(X_xor[y_xor == -1, 0],\n",
    "            X_xor[y_xor == -1, 1],\n",
    "            c='r',\n",
    "            marker='s',\n",
    "            label='-1')\n",
    "\n",
    "plt.xlim([-3, 3])\n",
    "plt.ylim([-3, 3])\n",
    "plt.legend(loc='best')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/xor.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "X7zA-PVGRlhn"
   },
   "source": [
    "## Illustration on the exclusive OR (XOR)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 297
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 1110,
     "status": "ok",
     "timestamp": 1537540893779,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "Sg9HIqq4Rlho",
    "outputId": "541c49ca-50ad-4567-edab-cd454896c205"
   },
   "outputs": [],
   "source": [
    "svm = SVC(kernel='rbf', random_state=0, gamma=0.10, C=10.0)\n",
    "svm.fit(X_xor, y_xor)\n",
    "\n",
    "\n",
    "plot_decision_regions(X_xor, y_xor,\n",
    "                      classifier=svm)\n",
    "\n",
    "plt.legend(loc='upper left')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/support_vector_machine_rbf_xor.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 314
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 916,
     "status": "ok",
     "timestamp": 1537540897108,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "NEGtMJGqRlhq",
    "outputId": "56f35a94-ef79-4091-f7cd-5bac70a6669a"
   },
   "outputs": [],
   "source": [
    "from sklearn.svm import SVC\n",
    "\n",
    "# A relatively small gamma gives a smooth RBF decision boundary\n",
    "svm = SVC(kernel='rbf', random_state=0, gamma=0.2, C=1.0)\n",
    "svm.fit(X_train_std, y_train)\n",
    "\n",
    "\n",
    "print('Accuracy: %.2f' % svm.score(X_test_std, y_test))\n",
    "\n",
    "\n",
    "plot_decision_regions(X_combined_std, y_combined,\n",
    "                      classifier=svm, test_idx=range(105, 150))\n",
    "plt.xlabel('petal length [standardized]')\n",
    "plt.ylabel('petal width [standardized]')\n",
    "plt.legend(loc='upper left')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/support_vector_machine_rbf_iris_1.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 314
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 895,
     "status": "ok",
     "timestamp": 1537540900630,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "WjVr5hTnRlhy",
    "outputId": "3b443786-5735-4a08-c027-35cd409d77b1"
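   },
   "outputs": [],
   "source": [
    "# A very large gamma makes each training sample's influence very local, so the boundary wraps tightly around the training data\n",
    "svm = SVC(kernel='rbf', random_state=0, gamma=100.0, C=1.0)\n",
    "svm.fit(X_train_std, y_train)\n",
    "\n",
    "\n",
    "print('Accuracy: %.2f' % svm.score(X_test_std, y_test))\n",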
    "\n",
    "plot_decision_regions(X_combined_std, y_combined, \n",
    "                      classifier=svm, test_idx=range(105, 150))\n",
    "plt.xlabel('petal length [standardized]')\n",
    "plt.ylabel('petal width [standardized]')\n",
    "plt.legend(loc='upper left')\n",
    "plt.tight_layout()\n",
    "# plt.savefig('./figures/support_vector_machine_rbf_iris_2.png', dpi=300)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "z7WBQPkCRlh_"
   },
   "source": [
    "# Assignment <a id=\"Travail-à-réaliser\"></a>\n",
    "\n",
    "For the following datasets:\n",
    "- [Iris](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html#sklearn.datasets.load_iris)\n",
    "- [Digits](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits) (using the full data)\n",
    "- [Breast cancer](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html)\n",
    "\n",
    "carry out the following experiments:\n",
    "\n",
    "- Tune several types of classifiers (Perceptron, logistic regression, SVM, k-NN). For each type of classifier, you must:\n",
    "  - Define the hyperparameters to vary.\n",
    "  - Evaluate all the candidate configurations by grid search and select the best one, using 3-fold cross-validation to estimate generalization performance. You may draw on code such as [this example](http://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html#sphx-glr-auto-examples-classification-plot-classifier-comparison-py) to loop over the datasets and/or the classifiers.\n",
    "- **Write up, as a summary table, the best performance obtained by each model on each dataset (on the test set).**\n",
    "- Draw conclusions from the results about the performance, stability, and robustness of the classifier families used, and about the optimal parameters of each type of model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 69
    },
    "colab_type": "code",
    "collapsed": true,
    "executionInfo": {
     "elapsed": 516,
     "status": "ok",
     "timestamp": 1537540920673,
     "user": {
      "displayName": "ronan sicre",
      "photoUrl": "https://lh3.googleusercontent.com/a/default-user=s128",
      "userId": "117556902235679898709"
     },
     "user_tz": -120
    },
    "id": "qGHZWsA2Rlh_",
    "outputId": "41ed4bee-e8d1-4319-ea49-3df5f864e89d"
   },
   "outputs": [],
   "source": [
    "from sklearn import datasets\n",
    "\n",
    "# loading datasets\n",
    "ir = datasets.load_iris()\n",
    "dig = datasets.load_digits()\n",
    "bc = datasets.load_breast_cancer()\n",
    "\n",
    "print(ir.data.shape)\n",
    "print(bc.data.shape)\n",
    "print(dig.data.shape)\n"
   ]
  },
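  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The grid-search step of the assignment can be sketched as follows for one classifier on one dataset. This is only an illustrative starting point, not a reference solution: the choice of `SVC`, the grid values, the stratified 70/30 split and the names `pipe`, `param_grid` and `search` are all assumptions made for this example. The scaler is placed inside a `Pipeline` so that it is refit on the training part of each cross-validation fold."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn import datasets\n",
    "from sklearn.model_selection import GridSearchCV, train_test_split\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.svm import SVC\n",
    "\n",
    "# One dataset and one classifier; the same pattern can be looped over the others\n",
    "X_bc, y_bc = datasets.load_breast_cancer(return_X_y=True)\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(\n",
    "    X_bc, y_bc, test_size=0.3, random_state=0, stratify=y_bc)\n",
    "\n",
    "pipe = Pipeline([('scaler', StandardScaler()),\n",
    "                 ('clf', SVC(random_state=0))])\n",
    "\n",
    "# Illustrative grid (assumed values), addressed with the step__parameter syntax\n",
    "param_grid = {'clf__kernel': ['linear', 'rbf'],\n",
    "              'clf__C': [0.1, 1.0, 10.0],\n",
    "              'clf__gamma': [0.01, 0.1, 1.0]}\n",
    "\n",
    "# 3-fold cross-validation, run on the training set only\n",
    "search = GridSearchCV(pipe, param_grid, cv=3, scoring='accuracy')\n",
    "search.fit(X_tr, y_tr)\n",
    "\n",
    "print('Best parameters:', search.best_params_)\n",
    "print('Best CV accuracy: %.3f' % search.best_score_)\n",
    "print('Test accuracy: %.3f' % search.score(X_te, y_te))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Collecting `search.best_params_` and the test score for each (dataset, classifier) pair would give the rows of the summary table."
   ]
  },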
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "collapsed": true,
    "id": "VaXfzC6BRliG"
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "DS_TP2_2018_v2.ipynb",
   "provenance": [],
   "version": "0.3.2"
  },
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}