| 
5 | 5 |     "colab": {  | 
6 | 6 |       "name": "FactorAnalysisPersonality.ipynb",  | 
7 | 7 |       "provenance": [],  | 
 | 8 | +      "toc_visible": true,  | 
8 | 9 |       "include_colab_link": true  | 
9 | 10 |     },  | 
10 | 11 |     "kernelspec": {  | 
 | 
24 | 25 |       ]  | 
25 | 26 |     },  | 
26 | 27 |     {  | 
27 |  | -      "cell_type": "code",  | 
 | 28 | +      "cell_type": "markdown",  | 
28 | 29 |       "metadata": {  | 
29 |  | -        "id": "_zn3Cg6ECVcW",  | 
30 |  | -        "colab_type": "code",  | 
31 |  | -        "colab": {}  | 
 | 30 | +        "id": "qTHBjB_hPxZ1",  | 
 | 31 | +        "colab_type": "text"  | 
32 | 32 |       },  | 
33 | 33 |       "source": [  | 
34 |  | -        ""  | 
35 |  | -      ],  | 
36 |  | -      "execution_count": 0,  | 
37 |  | -      "outputs": []  | 
 | 34 | +        "# Background on the data"  | 
 | 35 | +      ]  | 
 | 36 | +    },  | 
 | 37 | +    {  | 
 | 38 | +      "cell_type": "markdown",  | 
 | 39 | +      "metadata": {  | 
 | 40 | +        "id": "814cvzSKQBjL",  | 
 | 41 | +        "colab_type": "text"  | 
 | 42 | +      },  | 
 | 43 | +      "source": [  | 
 | 44 | +        "A personality test is given to a large group of people. It has 5 questions each on agreeableness (A1-5), conscientiousness (C1-5), extraversion (E1-5), neuroticism (N1-5), and openness (O1-5). \n",  | 
 | 45 | +        "\n",  | 
 | 46 | +        "Those answers are directly related to one another.\n",  | 
 | 47 | +        "\n",  | 
 | 48 | +        "We want to see whether, based on people's answers, we can recover those categories.\n",  | 
 | 49 | +        "If we can, we will end up with 5 factors, each influenced mainly by the questions that correspond to it. \n",  | 
 | 50 | +        "\n"  | 
 | 51 | +      ]  | 
38 | 52 |     },  | 
39 | 53 |     {  | 
40 | 54 |       "cell_type": "markdown",  | 
 | 
71 | 85 |       "metadata": {  | 
72 | 86 |         "id": "tP4cR7ZrSuLc",  | 
73 | 87 |         "colab_type": "code",  | 
74 |  | -        "outputId": "d52f563c-c0d0-43ca-a322-8f1885cc7dc8",  | 
 | 88 | +        "outputId": "611e1c92-4bc6-4193-d21b-5b717b8bf307",  | 
75 | 89 |         "colab": {  | 
76 | 90 |           "base_uri": "https://localhost:8080/",  | 
77 | 91 |           "height": 207  | 
 | 
99 | 113 |         "gauth.credentials = GoogleCredentials.get_application_default()\n",  | 
100 | 114 |         "drive = GoogleDrive(gauth)"  | 
101 | 115 |       ],  | 
102 |  | -      "execution_count": 0,  | 
 | 116 | +      "execution_count": 2,  | 
103 | 117 |       "outputs": [  | 
104 | 118 |         {  | 
105 | 119 |           "output_type": "stream",  | 
106 | 120 |           "text": [  | 
107 | 121 |             "Collecting factor_analyzer==0.2.3\n",  | 
108 | 122 |             "  Downloading https://files.pythonhosted.org/packages/79/1b/84808bbeee0f3a8753c3d8034baf0aa0013cf08957eff750f366ce83f04a/factor_analyzer-0.2.3-py2.py3-none-any.whl\n",  | 
 | 123 | +            "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from factor_analyzer==0.2.3) (1.16.5)\n",  | 
109 | 124 |             "Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from factor_analyzer==0.2.3) (0.24.2)\n",  | 
110 | 125 |             "Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from factor_analyzer==0.2.3) (1.3.1)\n",  | 
111 |  | -            "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from factor_analyzer==0.2.3) (1.16.5)\n",  | 
112 | 126 |             "Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas->factor_analyzer==0.2.3) (2018.9)\n",  | 
113 | 127 |             "Requirement already satisfied: python-dateutil>=2.5.0 in /usr/local/lib/python3.6/dist-packages (from pandas->factor_analyzer==0.2.3) (2.5.3)\n",  | 
114 | 128 |             "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.6/dist-packages (from python-dateutil>=2.5.0->pandas->factor_analyzer==0.2.3) (1.12.0)\n",  | 
 | 
126 | 140 |         "colab_type": "text"  | 
127 | 141 |       },  | 
128 | 142 |       "source": [  | 
129 |  | -        "Now we will load the data. This line reads in the comma separated value sheet that I made in excel.\n",  | 
 | 143 | +        "Now we will load the data. This line reads in the comma-separated values (CSV) sheet.\n",  | 
130 | 144 |         "\n",  | 
131 | 145 |         "If you are given sensor data in Excel and want to export it to CSV in the future, see this link:\n",  | 
132 | 146 |         "https://www.ablebits.com/office-addins-blog/2014/04/24/convert-excel-csv/"  | 
 | 
175 | 189 |       "metadata": {  | 
176 | 190 |         "id": "cbE21AiDU_5L",  | 
177 | 191 |         "colab_type": "code",  | 
178 |  | -        "outputId": "0f51b05e-0763-4115-8322-0d1ff60199f6",  | 
 | 192 | +        "outputId": "2a0c7e8c-e5f4-4a9e-f4ba-12ef316d9f9e",  | 
179 | 193 |         "colab": {  | 
180 | 194 |           "base_uri": "https://localhost:8080/",  | 
181 | 195 |           "height": 544  | 
 | 
200 | 214 |         "#calculate the number of variables\n",  | 
201 | 215 |         "numVars = df.shape[1]-len(unnecessaryColumns)\n"  | 
202 | 216 |       ],  | 
203 |  | -      "execution_count": 0,  | 
 | 217 | +      "execution_count": 5,  | 
204 | 218 |       "outputs": [  | 
205 | 219 |         {  | 
206 | 220 |           "output_type": "stream",  | 
 | 
288 | 302 |       "metadata": {  | 
289 | 303 |         "id": "aXtTewA1WHwR",  | 
290 | 304 |         "colab_type": "code",  | 
291 |  | -        "outputId": "e579ef8c-4963-45a7-e4a2-2041627acdf1",  | 
 | 305 | +        "outputId": "3b7b6a91-717d-4ad4-e23b-2ee1316d3bb4",  | 
292 | 306 |         "colab": {  | 
293 | 307 |           "base_uri": "https://localhost:8080/",  | 
294 | 308 |           "height": 34  | 
 | 
300 | 314 |         "chi_square_value,p_value=calculate_bartlett_sphericity(df)\n",  | 
301 | 315 |         "chi_square_value, p_value"  | 
302 | 316 |       ],  | 
303 |  | -      "execution_count": 0,  | 
 | 317 | +      "execution_count": 6,  | 
304 | 318 |       "outputs": [  | 
305 | 319 |         {  | 
306 | 320 |           "output_type": "execute_result",  | 
 | 
312 | 326 |           "metadata": {  | 
313 | 327 |             "tags": []  | 
314 | 328 |           },  | 
315 |  | -          "execution_count": 4  | 
 | 329 | +          "execution_count": 6  | 
316 | 330 |         }  | 
317 | 331 |       ]  | 
318 | 332 |     },  | 
 | 
351 | 365 |       "metadata": {  | 
352 | 366 |         "id": "gJbu2WwCXBa0",  | 
353 | 367 |         "colab_type": "code",  | 
354 |  | -        "outputId": "d27d5cbe-3eb2-4dec-f19d-10d268bd818c",  | 
 | 368 | +        "outputId": "17522ec2-a2c3-4a8c-cb99-97e23c603420",  | 
355 | 369 |         "colab": {  | 
356 | 370 |           "base_uri": "https://localhost:8080/",  | 
357 | 371 |           "height": 34  | 
 | 
364 | 378 |         "\n",  | 
365 | 379 |         "kmo_model"  | 
366 | 380 |       ],  | 
367 |  | -      "execution_count": 0,  | 
 | 381 | +      "execution_count": 7,  | 
368 | 382 |       "outputs": [  | 
369 | 383 |         {  | 
370 | 384 |           "output_type": "execute_result",  | 
 | 
376 | 390 |           "metadata": {  | 
377 | 391 |             "tags": []  | 
378 | 392 |           },  | 
379 |  | -          "execution_count": 5  | 
 | 393 | +          "execution_count": 7  | 
380 | 394 |         }  | 
381 | 395 |       ]  | 
382 | 396 |     },  | 
 | 
425 | 439 |       "metadata": {  | 
426 | 440 |         "id": "Z6zPLCYTXNgL",  | 
427 | 441 |         "colab_type": "code",  | 
428 |  | -        "outputId": "49e91e74-94aa-4158-8d27-7437f971fac9",  | 
 | 442 | +        "outputId": "37c538c0-50ca-4b5d-b43d-177486fd110f",  | 
429 | 443 |         "colab": {  | 
430 | 444 |           "base_uri": "https://localhost:8080/",  | 
431 | 445 |           "height": 855  | 
 | 
439 | 453 |         "ev, v = fa.get_eigenvalues()\n",  | 
440 | 454 |         "ev"  | 
441 | 455 |       ],  | 
442 |  | -      "execution_count": 0,  | 
 | 456 | +      "execution_count": 8,  | 
443 | 457 |       "outputs": [  | 
444 | 458 |         {  | 
445 | 459 |           "output_type": "execute_result",  | 
 | 
608 | 622 |           "metadata": {  | 
609 | 623 |             "tags": []  | 
610 | 624 |           },  | 
611 |  | -          "execution_count": 6  | 
 | 625 | +          "execution_count": 8  | 
612 | 626 |         }  | 
613 | 627 |       ]  | 
614 | 628 |     },  | 
 | 
627 | 641 |       "metadata": {  | 
628 | 642 |         "id": "8EcInPmvXS4E",  | 
629 | 643 |         "colab_type": "code",  | 
630 |  | -        "outputId": "6050dd3b-4043-4c62-9c88-bcb005c4d5e7",  | 
 | 644 | +        "outputId": "6589d02c-5353-457b-a17b-c79dd90408d2",  | 
631 | 645 |         "colab": {  | 
632 | 646 |           "base_uri": "https://localhost:8080/",  | 
633 | 647 |           "height": 295  | 
 | 
643 | 657 |         "plt.grid()\n",  | 
644 | 658 |         "plt.show()"  | 
645 | 659 |       ],  | 
646 |  | -      "execution_count": 0,  | 
 | 660 | +      "execution_count": 9,  | 
647 | 661 |       "outputs": [  | 
648 | 662 |         {  | 
649 | 663 |           "output_type": "display_data",  | 
 | 
689 | 703 |       "metadata": {  | 
690 | 704 |         "id": "es4vNDC1XYdW",  | 
691 | 705 |         "colab_type": "code",  | 
692 |  | -        "outputId": "399e5d49-3f56-4be5-e704-5c7c6f660c6e",  | 
 | 706 | +        "outputId": "c5735c87-1ad2-4027-ead4-1dfeed33b5a4",  | 
693 | 707 |         "colab": {  | 
694 | 708 |           "base_uri": "https://localhost:8080/",  | 
695 | 709 |           "height": 855  | 
 | 
702 | 716 |         "fa.analyze(df, numFactors, rotation=\"varimax\")\n",  | 
703 | 717 |         "fa.loadings"  | 
704 | 718 |       ],  | 
705 |  | -      "execution_count": 0,  | 
 | 719 | +      "execution_count": 10,  | 
706 | 720 |       "outputs": [  | 
707 | 721 |         {  | 
708 | 722 |           "output_type": "execute_result",  | 
 | 
979 | 993 |           "metadata": {  | 
980 | 994 |             "tags": []  | 
981 | 995 |           },  | 
982 |  | -          "execution_count": 8  | 
 | 996 | +          "execution_count": 10  | 
983 | 997 |         }  | 
984 | 998 |       ]  | 
985 | 999 |     },  | 
 | 
998 | 1012 |       "metadata": {  | 
999 | 1013 |         "id": "jXtE5F5VaLvW",  | 
1000 | 1014 |         "colab_type": "code",  | 
1001 |  | -        "outputId": "ee31f049-282c-443d-9edc-4389d8ac7cf7",  | 
 | 1015 | +        "outputId": "5ebab41d-9386-42c6-de90-0ed765640d62",  | 
1002 | 1016 |         "cellView": "code",  | 
1003 | 1017 |         "colab": {  | 
1004 | 1018 |           "base_uri": "https://localhost:8080/",  | 
1005 |  | -          "height": 102  | 
 | 1019 | +          "height": 122  | 
1006 | 1020 |         }  | 
1007 | 1021 |       },  | 
1008 | 1022 |       "source": [  | 
 | 
1015 | 1029 |         "  contributions = [(np.round(factor[x],2),headings[x]) for x in descending if np.abs(factor[x])>factor_threshold]\n",  | 
1016 | 1030 |         "  print('Factor %d:'%(i+1),contributions)"  | 
1017 | 1031 |       ],  | 
1018 |  | -      "execution_count": 0,  | 
 | 1032 | +      "execution_count": 13,  | 
1019 | 1033 |       "outputs": [  | 
1020 | 1034 |         {  | 
1021 | 1035 |           "output_type": "stream",  | 
 | 
1059 | 1073 |         "colab_type": "text"  | 
1060 | 1074 |       },  | 
1061 | 1075 |       "source": [  | 
1062 |  | -        "Factor 1 is mostly influenced by the questions about extrovertedness (E); some parts of agreeableness, openness, and neuroticism influence this, but it is mostly E.\n",  | 
 | 1076 | +        "Factor 1 is mostly influenced by the questions about extrovertedness (E); some parts of agreeableness, openness, and neuroticism influence this, but it is mostly E. These were the original E questions.\n",  | 
1063 | 1077 |         "\n",  | 
1064 |  | -        "Factor 2 is mostly influenced by neuroticism (N), but some parts of extrovertedness and conscientiousness play a role.\n",  | 
 | 1078 | +        "Factor 2 is mostly influenced by neuroticism (N), but some parts of extrovertedness and conscientiousness play a role. These were the original N questions.\n",  | 
1065 | 1079 |         "\n",  | 
1066 | 1080 |         "Factor 3 is mostly influenced by conscientiousness, but extrovertedness can play a role.\n",  | 
 | 1081 | +        "These were the original C questions.\n",  | 
1067 | 1082 |         "\n",  | 
1068 | 1083 |         "Factor 4 is mostly influenced by openness but extrovertedness plays a role.\n",  | 
 | 1084 | +        "These were the original O questions.\n",  | 
1069 | 1085 |         "\n",  | 
1070 | 1086 |         "Factor 5 is mostly influenced by agreeableness but extrovertedness plays a role.\n",  | 
 | 1087 | +        "These were the original A questions.\n",  | 
1071 | 1088 |         "\n",  | 
1072 |  | -        "In summary, extrovertedness played a role in all of the factors, so it is dominating the system. However, looking at individual factors, we can see that specific traits influence specific factors. "  | 
 | 1089 | +        "In summary, we were able to extract the original groupings. However, it is interesting to note the role that E plays in all of the factors. This could suggest that there is some bias or systematic error at play. "  | 
1073 | 1090 |       ]  | 
1074 | 1091 |     },  | 
1075 | 1092 |     {  | 
 | 
1082 | 1099 |         "# Computing the variance"  | 
1083 | 1100 |       ]  | 
1084 | 1101 |     },  | 
 | 1102 | +    {  | 
 | 1103 | +      "cell_type": "code",  | 
 | 1104 | +      "metadata": {  | 
 | 1105 | +        "id": "bxzTgYfzRvsp",  | 
 | 1106 | +        "colab_type": "code",  | 
 | 1107 | +        "colab": {}  | 
 | 1108 | +      },  | 
 | 1109 | +      "source": [  | 
 | 1110 | +        ""  | 
 | 1111 | +      ],  | 
 | 1112 | +      "execution_count": 0,  | 
 | 1113 | +      "outputs": []  | 
 | 1114 | +    },  | 
1085 | 1115 |     {  | 
1086 | 1116 |       "cell_type": "code",  | 
1087 | 1117 |       "metadata": {  | 
 | 