{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3KsVevPzGnxX",
   "metadata": {
    "id": "3KsVevPzGnxX"
   },
   "source": [
    "# SW12 InClass - Repetition and Programming\n",
    "\n",
    "This notebook is intended for interactive review during the first lesson in class. Write your solutions directly into the code cells.\n",
    "\n",
    "***"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb57p96R3V8L",
   "metadata": {
    "id": "fb57p96R3V8L"
   },
   "source": [
    "# 2.7 Random Variables"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "oYUyelxH3aXq",
   "metadata": {
    "id": "oYUyelxH3aXq"
   },
   "source": [
    "### Example 2.7.1\n",
    "In the following, the random variable (RV) $X:\\Omega\\to W_X$ is implemented as a Python function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "KnscTQqm36N0",
   "metadata": {
    "id": "KnscTQqm36N0"
   },
   "outputs": [],
   "source": [
    "Omega = {\"Ass\",\"Koenig\",\"Ober\",\"Under\",\"Banner\",\"Neun\",\"Acht\",\"Sieben\",\"Sechs\"}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3j0dwyIM5lEi",
   "metadata": {
    "id": "3j0dwyIM5lEi"
   },
   "outputs": [],
   "source": [
    "def X(omega):\n",
    "    if omega not in Omega:\n",
    "        return None\n",
    "    elif (omega==\"Ass\"):\n",
    "        return 11\n",
    "    elif (omega==\"Koenig\"):\n",
    "        return 4\n",
    "    elif (omega==\"Ober\"):\n",
    "        return 3\n",
    "    elif (omega==\"Under\"):\n",
    "        return 2\n",
    "    elif (omega==\"Banner\"):\n",
    "        return 10\n",
    "    elif (omega==\"Neun\"):\n",
    "        return 0\n",
    "    elif (omega==\"Acht\"):\n",
    "        return 0\n",
    "    elif (omega==\"Sieben\"):\n",
    "        return 0\n",
    "    elif (omega==\"Sechs\"):\n",
    "        return 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "IonjfYUo58J2",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 224,
     "status": "ok",
     "timestamp": 1731850678561,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "IonjfYUo58J2",
    "outputId": "d058cb75-6ef0-42fd-d47b-634da21a3b8d"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "11"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# RV X: Mapping from the sample space to the set of real numbers (or the set of integers)\n",
    "\n",
    "# Example:\n",
    "X(\"Ass\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "EMW9En7P2HOP",
   "metadata": {
    "id": "EMW9En7P2HOP"
   },
   "source": [
    "Questions:\n",
    "- What is the sample space $\\Omega$ here?\n",
    "- What is the range $W_X$ of the RV $X$?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eDCQHC-bLr2d",
   "metadata": {
    "id": "eDCQHC-bLr2d"
   },
   "source": [
    "# 2.8 Key Figures of a Distribution"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3TLbmOCxL_JF",
   "metadata": {
    "id": "3TLbmOCxL_JF"
   },
   "source": [
    "### Examples 2.8.1, 2.8.2\n",
    "Here we sum over the outcomes $\\omega\\in\\Omega$, weighting each term by the probability $P(\\omega)$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9apIbypUMAho",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 248,
     "status": "ok",
     "timestamp": 1732720198423,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "9apIbypUMAho",
    "outputId": "c7cdb28b-e653-4d18-ab53-4a5085ce57a2"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sum: 1.0000000000000002\n",
      "Mean value: 3.333333333333333\n",
      "Variance: 16.666666666666664 ; standard deviation: 4.0824829046386295\n"
     ]
    }
   ],
   "source": [
    "# Calculation of expected value and variance\n",
    "\n",
    "def P(omega):\n",
    "    # return value: probability of the outcome omega (None if omega is not in Omega)\n",
    "    if omega not in Omega:\n",
    "        return None\n",
    "    elif (omega==\"Ass\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Koenig\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Ober\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Under\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Banner\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Neun\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Acht\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Sieben\"):\n",
    "        return 1/9\n",
    "    elif (omega==\"Sechs\"):\n",
    "        return 1/9\n",
    "\n",
    "s = 0\n",
    "for omega in Omega:\n",
    "    s += P(omega)\n",
    "print(\"Sum:\",s)\n",
    "\n",
    "m = 0\n",
    "for omega in Omega:\n",
    "    m += X(omega)*P(omega)\n",
    "print(\"Mean value:\",m)\n",
    "\n",
    "v = 0\n",
    "for omega in Omega:\n",
    "    v += (X(omega)-m)**2*P(omega)\n",
    "print(\"Variance:\",v,\"; standard deviation:\",v**0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "uK0Po978Ixnr",
   "metadata": {
    "id": "uK0Po978Ixnr"
   },
   "source": [
    "Here we sum over the values $x\\in W_X$, weighting each term by $P(\\{\\omega\\in\\Omega \\mid X(\\omega)=x\\})$, written as $P(X=x)$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "X69q9tbRI2C1",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 233,
     "status": "ok",
     "timestamp": 1732720257045,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "X69q9tbRI2C1",
    "outputId": "9a37193b-7caf-406e-a749-1d10be3b32f9"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sum: 1.0000000000000002\n",
      "Mean value: 3.333333333333333\n",
      "Variance: 16.666666666666668 ; standard deviation: 4.08248290463863\n"
     ]
    }
   ],
   "source": [
    "# Calculation of mean value and variance under the probability distribution given in Table 2.6\n",
    "\n",
    "def p(x):\n",
    "    if (x==0):\n",
    "        return 4/9\n",
    "    elif (x==2):\n",
    "        return 1/9\n",
    "    elif (x==3):\n",
    "        return 1/9\n",
    "    elif (x==4):\n",
    "        return 1/9\n",
    "    elif (x==10):\n",
    "        return 1/9\n",
    "    elif (x==11):\n",
    "        return 1/9\n",
    "    else:\n",
    "        return 0\n",
    "\n",
    "s = 0\n",
    "for x in {0,2,3,4,10,11}:\n",
    "    s += p(x)\n",
    "print(\"Sum:\",s)\n",
    "\n",
    "m = 0\n",
    "for x in {0,2,3,4,10,11}:\n",
    "    m += x*p(x)\n",
    "print(\"Mean value:\",m)\n",
    "\n",
    "v = 0\n",
    "for x in {0,2,3,4,10,11}:\n",
    "    v += (x-m)**2*p(x)\n",
    "print(\"Variance:\",v,\"; standard deviation:\",v**0.5)"
   ]
  },
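  {
   "cell_type": "markdown",
   "id": "gr-pmf-from-outcomes-md",
   "metadata": {},
   "source": [
    "The probabilities $P(X=x)$ typed in by hand above can also be derived from the outcome probabilities and the card values. The following is a minimal sketch, assuming the card values of Example 2.7.1 and equally likely cards. The names `card_value`, `prob` and `pmf` are illustrative and not part of the course material, and `fractions.Fraction` is used only to keep the arithmetic exact."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "gr-pmf-from-outcomes-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "from fractions import Fraction\n",
    "\n",
    "# Card values from Example 2.7.1 (same mapping as the function X)\n",
    "card_value = {\"Ass\": 11, \"Koenig\": 4, \"Ober\": 3, \"Under\": 2, \"Banner\": 10,\n",
    "              \"Neun\": 0, \"Acht\": 0, \"Sieben\": 0, \"Sechs\": 0}\n",
    "\n",
    "# Assumption: every card is equally likely\n",
    "prob = {omega: Fraction(1, len(card_value)) for omega in card_value}\n",
    "\n",
    "# P(X = x) is the total probability of all outcomes omega with X(omega) = x\n",
    "pmf = {}\n",
    "for omega, x in card_value.items():\n",
    "    pmf[x] = pmf.get(x, 0) + prob[omega]\n",
    "\n",
    "# Mean and variance computed from the derived distribution\n",
    "mean = sum(x * q for x, q in pmf.items())\n",
    "variance = sum((x - mean) ** 2 * q for x, q in pmf.items())\n",
    "print(pmf[0], mean, variance)"
   ]
  },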
  {
   "cell_type": "markdown",
   "id": "duBhbqgB5sEk",
   "metadata": {
    "id": "duBhbqgB5sEk"
   },
   "source": [
    "## Problem for Example 2.8.3\n",
    "Plot the standard deviation of a Bernoulli-distributed RV as a function of the parameter $p$ and interpret the result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c04344ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "..."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "y4hGAFNdlGhR",
   "metadata": {
    "id": "y4hGAFNdlGhR"
   },
   "source": [
    "# 2.9 Cumulative Distribution Function\n",
    "$$\n",
    "F(x)=P(X\\le x)=\\sum_{y\\le x}P(X=y)\n",
    "$$"
   ]
  },
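  {
   "cell_type": "markdown",
   "id": "gr-cdf-cumsum-md",
   "metadata": {},
   "source": [
    "As a small illustration of the formula above, the running sum can be computed with `np.cumsum`. This is only a sketch, assuming the card-value distribution from Examples 2.8.1 and 2.8.2 rather than the two-dice problem. The names `values`, `probs`, `cdf` and `F` are illustrative. Note that the values must be sorted in ascending order for the cumulative sum to equal $F$ at each value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "gr-cdf-cumsum-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Values (ascending) and probabilities of the card-value RV\n",
    "values = np.array([0, 2, 3, 4, 10, 11])\n",
    "probs = np.array([4, 1, 1, 1, 1, 1]) / 9\n",
    "\n",
    "# F evaluated at each value: running sum of the probabilities\n",
    "cdf = np.cumsum(probs)\n",
    "\n",
    "def F(x):\n",
    "    # Sum of P(X = y) over all values y <= x (0 if x lies below all values)\n",
    "    return probs[values <= x].sum()\n",
    "\n",
    "print(cdf)\n",
    "print(F(3.5))"
   ]
  },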
  {
   "cell_type": "markdown",
   "id": "bd657575",
   "metadata": {
    "id": "bd657575"
   },
   "source": [
    "## Problem\n",
    "The random variable $X$ represents the sum of two six-sided dice. Plot the function $F(x)$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "7ulzqTh0UgV8",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 447
    },
    "executionInfo": {
     "elapsed": 465,
     "status": "ok",
     "timestamp": 1732720429056,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "7ulzqTh0UgV8",
    "outputId": "3423cb03-71bf-4954-dc8f-d50868ea7523"
   },
   "outputs": [],
   "source": [
    "# Calculation of cumulative distribution function using np.cumsum\n",
    "\n",
    "values_two_dice_sum = [-1,2,3,4,5,6,7,8,9,10,11,12,14]\n",
    "prob_two_dice_sum = ...\n",
    "cum_two_dice_sum = ...\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ezEbmzpDlzJ4",
   "metadata": {
    "id": "ezEbmzpDlzJ4"
   },
   "source": [
    "# 2.10 Binomial Distribution\n",
    "Binomial coefficient:\n",
    "$$\n",
    "\\left(\\begin{array}{c}n\\\\k\\end{array}\\right)=\\frac{n!}{k!(n-k)!}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "EUlc-YzOnjBK",
   "metadata": {
    "id": "EUlc-YzOnjBK"
   },
   "source": [
    "### Example 2.10.2\n",
    "Computing the binomial coefficient in Python:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "_YV-8Y0lnhcF",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 230,
     "status": "ok",
     "timestamp": 1732720426603,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "_YV-8Y0lnhcF",
    "outputId": "7d7d1a88-d12b-4a1c-8485-59eebc8b9aa1"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "6\n"
     ]
    }
   ],
   "source": [
    "import math\n",
    "\n",
    "print(math.comb(4,2))"
   ]
  },
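  {
   "cell_type": "markdown",
   "id": "gr-binom-coeff-md",
   "metadata": {},
   "source": [
    "As a quick cross-check, the factorial formula from the start of this section can be evaluated directly and compared with `math.comb`. A minimal sketch; the helper name `binom_coeff` is made up for this illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "gr-binom-coeff-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "def binom_coeff(n, k):\n",
    "    # Direct translation of n! / (k! (n-k)!); the integer division is exact\n",
    "    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))\n",
    "\n",
    "print(binom_coeff(4, 2), math.comb(4, 2))"
   ]
  },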
  {
   "cell_type": "markdown",
   "id": "aPFDjaxFnxWy",
   "metadata": {
    "id": "aPFDjaxFnxWy"
   },
   "source": [
    "### Remark 2.10.1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "LENTLirum68-",
   "metadata": {
    "id": "LENTLirum68-"
   },
   "source": [
    "Binomial$(n,p)$ distribution:\n",
    "$$\n",
    "P(X = x)=\\left(\\begin{array}{c}n\\\\x\\end{array}\\right)p^x(1-p)^{n-x}\\mbox{ for }x\\in\\{0,1,\\dots,n\\}\n",
    "$$\n",
    "Computing a probability in Python:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "CrDxF3W0U73i",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 535,
     "status": "ok",
     "timestamp": 1723176570351,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -420
    },
    "id": "CrDxF3W0U73i",
    "outputId": "86280743-e135-4dc4-f6f2-11488381a79d"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.3125\n"
     ]
    }
   ],
   "source": [
    "n = 5\n",
    "p = 0.5\n",
    "x = 3\n",
    "print(math.comb(n,x)*p**x*(1-p)**(n-x))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "l4MUwwDAVCfi",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 1859,
     "status": "ok",
     "timestamp": 1732720485085,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "l4MUwwDAVCfi",
    "outputId": "2c92a1af-9e5e-46a9-bdbc-27a77e949882"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.31249999999999983\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "from scipy.stats import binom\n",
    "\n",
    "print(binom.pmf(k=3, n=5, p=.5))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6QuDxJFMn5wo",
   "metadata": {
    "id": "6QuDxJFMn5wo"
   },
   "source": [
    "### Remark 2.10.2\n",
    "Computation of all probabilities of the distribution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cIndR37Sn9dd",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 281,
     "status": "ok",
     "timestamp": 1732720487225,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "cIndR37Sn9dd",
    "outputId": "e1bdf966-1266-437c-feb1-5e7c6cd5799f"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0.03125 0.15625 0.3125  0.3125  0.15625 0.03125]\n"
     ]
    }
   ],
   "source": [
    "n, p = 5, 0.5\n",
    "ran = np.arange(n+1)\n",
    "print(binom.pmf(ran, n, p))"
   ]
  },
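  {
   "cell_type": "markdown",
   "id": "gr-binom-check-md",
   "metadata": {},
   "source": [
    "A simple sanity check on these probabilities: they should sum to 1, and the mean and variance should match the known closed forms $np$ and $np(1-p)$ of the binomial distribution. A minimal sketch using only the formula from Remark 2.10.1; the names `pmf`, `total`, `mean` and `variance` are illustrative."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "gr-binom-check-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "n, p = 5, 0.5\n",
    "\n",
    "# Binomial(n, p) probabilities for x = 0, ..., n from the formula\n",
    "pmf = [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]\n",
    "\n",
    "total = sum(pmf)\n",
    "mean = sum(x * q for x, q in enumerate(pmf))\n",
    "variance = sum((x - mean)**2 * q for x, q in enumerate(pmf))\n",
    "\n",
    "# Compare with the closed forms n*p and n*p*(1-p)\n",
    "print(total, mean, variance)\n",
    "print(n * p, n * p * (1 - p))"
   ]
  },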
  {
   "cell_type": "markdown",
   "id": "IZBTbDLNoJm8",
   "metadata": {
    "id": "IZBTbDLNoJm8"
   },
   "source": [
    "## Problems for Example 2.10.3\n",
    "Calculate the following probabilities:\n",
    "* $P(X=20)$\n",
    "* $P(X\\lt 20)$\n",
    "* $P(X\\gt 20)$\n",
    "* $P(X\\lt 16)$\n",
    "* $P(X\\gt 24)$\n",
    "* $P(16\\le X\\le 24)$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "UR7vT3CdoO5v",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 219,
     "status": "ok",
     "timestamp": 1732720488424,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "UR7vT3CdoO5v",
    "outputId": "4cec80d7-868c-4a17-8d7d-3bb606723882"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Ellipsis"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n, p = 100, 0.2\n",
    "\n",
    "..."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "CUaLhonso8Sv",
   "metadata": {
    "id": "CUaLhonso8Sv"
   },
   "source": [
    "### Remark 2.10.3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "LDknr2Goo-Ez",
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 241,
     "status": "ok",
     "timestamp": 1732720605263,
     "user": {
      "displayName": "Tommy Hunziker",
      "userId": "15630146278306291522"
     },
     "user_tz": -60
    },
    "id": "LDknr2Goo-Ez",
    "outputId": "ff80efd1-48a1-4e1a-bc73-0b23a344296f"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.5397946186935895"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from scipy.stats import binom\n",
    "\n",
    "binom.cdf(k=50, n=100, p=.5)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aiFjfmSnnexk",
   "metadata": {
    "id": "aiFjfmSnnexk"
   },
   "source": [
    "Question: Why is the above result not equal to $\\frac12$?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "VuS1t9VpWGUo",
   "metadata": {
    "id": "VuS1t9VpWGUo"
   },
   "source": [
    "# Problem: Variance\n",
    "The following Python function generates a sequence of $n$ values, each drawn from a Binomial$(k,0.5)$ distribution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aeEIl2MCWFp9",
   "metadata": {
    "id": "aeEIl2MCWFp9"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def random_binom_10(n,k):\n",
    "    # Each row holds k fair coin flips (0 or 1); each row sum is one Binomial(k, 0.5) value\n",
    "    x = np.sum(np.random.randint(2,size=(n,k)),axis=1)\n",
    "    return x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "rK8Ilm9IWKBG",
   "metadata": {
    "id": "rK8Ilm9IWKBG"
   },
   "source": [
    "* Calculate the empirical variance of the sequence.\n",
    "\n",
    "* Numerically compute the mean deviation of the empirical variance from the variance of the Binomial$(k,0.5)$ distribution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ffc071e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "n = 20\n",
    "k = 10\n",
    "\n",
    "# empirical variance:\n",
    "x = random_binom_10(n,k)\n",
    "...\n",
    "\n",
    "# mean deviation from variance:\n",
    "..."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": [
    {
     "file_id": "1VWo5AnXzVSfTBFx2L1drgzDypQ49_Bge",
     "timestamp": 1730810216348
    }
   ]
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
