{ "cells": [ { "cell_type": "markdown", "id": "3KsVevPzGnxX", "metadata": { "id": "3KsVevPzGnxX" }, "source": [ "# SW12 InClass - Repetition and Programming\n", "\n", "This notebook is intended for interactive review during the first lesson in class. Write your solutions directly into the code cells.\n", "\n", "***" ] }, { "cell_type": "markdown", "id": "fb57p96R3V8L", "metadata": { "id": "fb57p96R3V8L" }, "source": [ "# 2.7 Random Variables" ] }, { "cell_type": "markdown", "id": "oYUyelxH3aXq", "metadata": { "id": "oYUyelxH3aXq" }, "source": [ "### Example 2.7.1\n", "In the following, the random variable (RV) $X:\\Omega\\to W_X$ is implemented as a Python function." ] }, { "cell_type": "code", "execution_count": null, "id": "KnscTQqm36N0", "metadata": { "id": "KnscTQqm36N0" }, "outputs": [], "source": [ "Omega = {\"Ass\",\"Koenig\",\"Ober\",\"Under\",\"Banner\",\"Neun\",\"Acht\",\"Sieben\",\"Sechs\"}" ] }, { "cell_type": "code", "execution_count": null, "id": "3j0dwyIM5lEi", "metadata": { "id": "3j0dwyIM5lEi" }, "outputs": [], "source": [ "def X(omega):\n", "    if omega not in Omega:\n", "        return None\n", "    elif (omega==\"Ass\"):\n", "        return 11\n", "    elif (omega==\"Koenig\"):\n", "        return 4\n", "    elif (omega==\"Ober\"):\n", "        return 3\n", "    elif (omega==\"Under\"):\n", "        return 2\n", "    elif (omega==\"Banner\"):\n", "        return 10\n", "    elif (omega==\"Neun\"):\n", "        return 0\n", "    elif (omega==\"Acht\"):\n", "        return 0\n", "    elif (omega==\"Sieben\"):\n", "        return 0\n", "    elif (omega==\"Sechs\"):\n", "        return 0" ] }, { "cell_type": "code", "execution_count": null, "id": "IonjfYUo58J2", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 224, "status": "ok", "timestamp": 1731850678561, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "IonjfYUo58J2", "outputId": "d058cb75-6ef0-42fd-d47b-634da21a3b8d" }, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 4, "metadata": 
{}, "output_type": "execute_result" } ], "source": [ "# RV X: Mapping from the sample space to the set of real numbers (or the set of integers)\n", "\n", "# Example:\n", "X(\"Ass\")" ] }, { "cell_type": "markdown", "id": "EMW9En7P2HOP", "metadata": { "id": "EMW9En7P2HOP" }, "source": [ "Questions:\n", "- Which sample space $\\Omega$ do we have?\n", "- What is the value range of the RV $X$?" ] }, { "cell_type": "markdown", "id": "eDCQHC-bLr2d", "metadata": { "id": "eDCQHC-bLr2d" }, "source": [ "# 2.8 Key Figures of a Distribution" ] }, { "cell_type": "markdown", "id": "3TLbmOCxL_JF", "metadata": { "id": "3TLbmOCxL_JF" }, "source": [ "### Examples 2.8.1, 2.8.2\n", "Here we sum over the elements $\\omega\\in\\Omega$ using the probability $P(\\omega)$:" ] }, { "cell_type": "code", "execution_count": null, "id": "9apIbypUMAho", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 248, "status": "ok", "timestamp": 1732720198423, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "9apIbypUMAho", "outputId": "c7cdb28b-e653-4d18-ab53-4a5085ce57a2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sum: 1.0000000000000002\n", "Mean value: 3.333333333333333\n", "Variance: 16.666666666666664 ; standard deviation: 4.0824829046386295\n" ] } ], "source": [ "# Calculation of expected value and variance\n", "\n", "def P(omega):\n", "    # return value: probability of the outcome omega\n", "    if omega not in Omega:\n", "        return None\n", "    elif (omega==\"Ass\"):\n", "        return 1/9\n", "    elif (omega==\"Koenig\"):\n", "        return 1/9\n", "    elif (omega==\"Ober\"):\n", "        return 1/9\n", "    elif (omega==\"Under\"):\n", "        return 1/9\n", "    elif (omega==\"Banner\"):\n", "        return 1/9\n", "    elif (omega==\"Neun\"):\n", "        return 1/9\n", "    elif (omega==\"Acht\"):\n", "        return 1/9\n", "    elif (omega==\"Sieben\"):\n", "        return 1/9\n", "    elif (omega==\"Sechs\"):\n", "        return 1/9\n", "\n", "s = 0\n", "for 
omega in Omega:\n", "    s += P(omega)\n", "print(\"Sum:\",s)\n", "\n", "m = 0\n", "for omega in Omega:\n", "    m += X(omega)*P(omega)\n", "print(\"Mean value:\",m)\n", "\n", "v = 0\n", "for omega in Omega:\n", "    v += (X(omega)-m)**2*P(omega)\n", "print(\"Variance:\",v,\"; standard deviation:\",v**0.5)" ] }, { "cell_type": "markdown", "id": "uK0Po978Ixnr", "metadata": { "id": "uK0Po978Ixnr" }, "source": [ "Here we sum over the elements $x\\in W_X$ using the probability $P(\\{\\omega\\in\\Omega \\mid X(\\omega)=x\\})$, written as $P(X=x)$:" ] }, { "cell_type": "code", "execution_count": null, "id": "X69q9tbRI2C1", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 233, "status": "ok", "timestamp": 1732720257045, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "X69q9tbRI2C1", "outputId": "9a37193b-7caf-406e-a749-1d10be3b32f9" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sum: 1.0000000000000002\n", "Mean value: 3.333333333333333\n", "Variance: 16.666666666666668 ; standard deviation: 4.08248290463863\n" ] } ], "source": [ "# Calculation of mean value and variance under the probability distribution given in Table 2.6\n", "\n", "def p(x):\n", "    if (x==0):\n", "        return 4/9\n", "    elif (x==2):\n", "        return 1/9\n", "    elif (x==3):\n", "        return 1/9\n", "    elif (x==4):\n", "        return 1/9\n", "    elif (x==10):\n", "        return 1/9\n", "    elif (x==11):\n", "        return 1/9\n", "    else:\n", "        return 0\n", "\n", "s = 0\n", "for x in {0,2,3,4,10,11}:\n", "    s += p(x)\n", "print(\"Sum:\",s)\n", "\n", "m = 0\n", "for x in {0,2,3,4,10,11}:\n", "    m += x*p(x)\n", "print(\"Mean value:\",m)\n", "\n", "v = 0\n", "for x in {0,2,3,4,10,11}:\n", "    v += (x-m)**2*p(x)\n", "print(\"Variance:\",v,\"; standard deviation:\",v**0.5)" ] }, { "cell_type": "markdown", "id": "duBhbqgB5sEk", "metadata": { "id": "duBhbqgB5sEk" }, "source": [ "## Problem for Example 2.8.3\n", "Plot the standard deviation of an RV with Bernoulli 
distribution as a function of $p$ and interpret the result." ] }, { "cell_type": "code", "execution_count": null, "id": "c04344ea", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "..." ] }, { "cell_type": "markdown", "id": "y4hGAFNdlGhR", "metadata": { "id": "y4hGAFNdlGhR" }, "source": [ "# 2.9 Cumulative Distribution Function\n", "$$\n", "F(x)=P(X\\le x)=\\sum_{y\\le x}P(X=y)\n", "$$" ] }, { "cell_type": "markdown", "id": "bd657575", "metadata": { "id": "bd657575" }, "source": [ "## Problem\n", "The random variable $X$ represents the sum of two dice (with 6 faces each). Plot the function $F(x)$." ] }, { "cell_type": "code", "execution_count": 1, "id": "7ulzqTh0UgV8", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 447 }, "executionInfo": { "elapsed": 465, "status": "ok", "timestamp": 1732720429056, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "7ulzqTh0UgV8", "outputId": "3423cb03-71bf-4954-dc8f-d50868ea7523" }, "outputs": [], "source": [ "# Calculation of cumulative distribution function using np.cumsum\n", "\n", "values_two_dice_sum = [-1,2,3,4,5,6,7,8,9,10,11,12,14]\n", "prob_two_dice_sum = ...\n", "cum_two_dice_sum = ...\n" ] }, { "cell_type": "markdown", "id": "ezEbmzpDlzJ4", "metadata": { "id": "ezEbmzpDlzJ4" }, "source": [ "# 2.10 Binomial Distribution\n", "Binomial coefficient:\n", "$$\n", "\\left(\\begin{array}{c}n\\\\k\\end{array}\\right)=\\frac{n!}{k!(n-k)!}\n", "$$" ] }, { "cell_type": "markdown", "id": "EUlc-YzOnjBK", "metadata": { "id": "EUlc-YzOnjBK" }, "source": [ "### Example 2.10.2\n", "Computation of the binomial coefficient in Python:" ] }, { "cell_type": "code", "execution_count": null, "id": "_YV-8Y0lnhcF", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 230, "status": "ok", "timestamp": 1732720426603, "user": { "displayName": "Tommy Hunziker", "userId": 
"15630146278306291522" }, "user_tz": -60 }, "id": "_YV-8Y0lnhcF", "outputId": "7d7d1a88-d12b-4a1c-8485-59eebc8b9aa1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6\n" ] } ], "source": [ "import math\n", "\n", "print(math.comb(4,2))" ] }, { "cell_type": "markdown", "id": "aPFDjaxFnxWy", "metadata": { "id": "aPFDjaxFnxWy" }, "source": [ "### Remark 2.10.1" ] }, { "cell_type": "markdown", "id": "LENTLirum68-", "metadata": { "id": "LENTLirum68-" }, "source": [ "Binomial$(n,p)$-distribution:\n", "$$\n", "P(X = x)=\\left(\\begin{array}{c}n\\\\x\\end{array}\\right)p^x(1-p)^{n-x}\\mbox{ for }x\\in\\{0,1,\\dots,n\\}\n", "$$\n", "Computation of the probability in Python:" ] }, { "cell_type": "code", "execution_count": null, "id": "CrDxF3W0U73i", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 535, "status": "ok", "timestamp": 1723176570351, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -420 }, "id": "CrDxF3W0U73i", "outputId": "86280743-e135-4dc4-f6f2-11488381a79d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.3125\n" ] } ], "source": [ "n = 5\n", "p = 0.5\n", "x = 3\n", "print(math.comb(n,x)*p**x*(1-p)**(n-x))" ] }, { "cell_type": "code", "execution_count": null, "id": "l4MUwwDAVCfi", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 1859, "status": "ok", "timestamp": 1732720485085, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "l4MUwwDAVCfi", "outputId": "2c92a1af-9e5e-46a9-bdbc-27a77e949882" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.31249999999999983\n" ] } ], "source": [ "import numpy as np\n", "from scipy.stats import binom\n", "\n", "print(binom.pmf(k=3, n=5, p=.5))" ] }, { "cell_type": "markdown", "id": "6QuDxJFMn5wo", "metadata": { "id": "6QuDxJFMn5wo" }, "source": [ "### Remark 2.10.2\n", 
"Computation of all probabilities of the distribution:" ] }, { "cell_type": "code", "execution_count": null, "id": "cIndR37Sn9dd", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 281, "status": "ok", "timestamp": 1732720487225, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "cIndR37Sn9dd", "outputId": "e1bdf966-1266-437c-feb1-5e7c6cd5799f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.03125 0.15625 0.3125 0.3125 0.15625 0.03125]\n" ] } ], "source": [ "n, p = 5, 0.5\n", "ran = np.arange(n+1)\n", "print(binom.pmf(ran, n, p))" ] }, { "cell_type": "markdown", "id": "IZBTbDLNoJm8", "metadata": { "id": "IZBTbDLNoJm8" }, "source": [ "## Problems for Example 2.10.3\n", "Compute the following probabilities:\n", "* $P(X=20)$\n", "* $P(X\\lt 20)$\n", "* $P(X\\gt 20)$\n", "* $P(X\\lt 16)$\n", "* $P(X\\gt 24)$\n", "* $P(16\\le X\\le 24)$" ] }, { "cell_type": "code", "execution_count": 2, "id": "UR7vT3CdoO5v", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 219, "status": "ok", "timestamp": 1732720488424, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "UR7vT3CdoO5v", "outputId": "4cec80d7-868c-4a17-8d7d-3bb606723882" }, "outputs": [ { "data": { "text/plain": [ "Ellipsis" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n, p = 100, 0.2\n", "\n", "..." 
] }, { "cell_type": "markdown", "id": "CUaLhonso8Sv", "metadata": { "id": "CUaLhonso8Sv" }, "source": [ "### Remark 2.10.3" ] }, { "cell_type": "code", "execution_count": null, "id": "LDknr2Goo-Ez", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 241, "status": "ok", "timestamp": 1732720605263, "user": { "displayName": "Tommy Hunziker", "userId": "15630146278306291522" }, "user_tz": -60 }, "id": "LDknr2Goo-Ez", "outputId": "ff80efd1-48a1-4e1a-bc73-0b23a344296f" }, "outputs": [ { "data": { "text/plain": [ "0.5397946186935895" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from scipy.stats import binom\n", "\n", "binom.cdf(k=50, n=100, p=.5)\n" ] }, { "cell_type": "markdown", "id": "aiFjfmSnnexk", "metadata": { "id": "aiFjfmSnnexk" }, "source": [ "Question: Why is the above result not equal to $\\frac12$?" ] }, { "cell_type": "markdown", "id": "VuS1t9VpWGUo", "metadata": { "id": "VuS1t9VpWGUo" }, "source": [ "# Problem: Variance\n", "The following Python program generates a sequence of $n$ $(k,0.5)$-binomially distributed values." ] }, { "cell_type": "code", "execution_count": null, "id": "aeEIl2MCWFp9", "metadata": { "id": "aeEIl2MCWFp9" }, "outputs": [], "source": [ "import numpy as np\n", "\n", "def random_binom_10(n,k):\n", "    x = np.sum(np.random.randint(2,size=(n,k)),axis=1)\n", "    return x" ] }, { "cell_type": "markdown", "id": "rK8Ilm9IWKBG", "metadata": { "id": "rK8Ilm9IWKBG" }, "source": [ "* Calculate the empirical variance of the sequence.\n", "\n", "* Calculate numerically the mean value of the deviation of the empirical variance from the variance of the $(k,0.5)$ binomial distribution." ] }, { "cell_type": "code", "execution_count": null, "id": "ffc071e9", "metadata": {}, "outputs": [], "source": [ "n = 20\n", "k = 10\n", "\n", "# empirical variance:\n", "x = random_binom_10(n,k)\n", "...\n", "\n", "# mean deviation from variance:\n", "..." 
] } ], "metadata": { "colab": { "provenance": [ { "file_id": "1VWo5AnXzVSfTBFx2L1drgzDypQ49_Bge", "timestamp": 1730810216348 } ] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 5 }