SMM540 Group Coursework 1 2022
Deadline 4 November 2022
1. Consider the dataset ObesityData.csv available on Moodle. For a description of the variables refer
to the pdf file ObesityData_description.
Perform a principal component analysis and report your conclusions. (Not more than 250 words,
excluding R code, plots, tables, etc.)
[Total marks: 15]
2. On the same dataset (ObesityData.csv), perform multidimensional scaling, comment on your
results, and compare them with those obtained using the PCA above. (Not more than 250 words,
excluding R code, plots, tables, etc.)
[Total marks: 10]
3. A factor analysis was carried out on a data matrix of variables relating to the occupational and
educational status of three generations of family members. A description of the ten variables used
in the analysis is given in Table 1.
Variable  Generation  Code    Description
x1        1           HF/O    Husband’s father’s occupational status
x2        1           WF/O    Wife’s father’s occupational status
x3        2           H/FE    Husband’s further education
x4        2           H/Q     Husband’s qualifications
x5        2           H/O     Husband’s occupational status
x6        2           W/FE    Wife’s further education
x7        2           W/Q     Wife’s qualifications
x8        3           FB/FE   Firstborn’s further education
x9        3           FB/Q    Firstborn’s qualifications
x10       3           FB/O    Firstborn’s occupational status
Table 1: Descriptions of social mobility variables.
(a) The unrotated factor loadings obtained from the three-factor model are given in Table 2.
Interpret them. [Marks: 7.5]
Variable  Code     α̂_i1     α̂_i2     α̂_i3
x1        HF/O    0.426    0.403    0.053
x2        WF/O    0.404    0.343    0.008
x3        H/FE    0.592   -0.026    0.116
x4        H/Q     0.558   -0.240    0.118
x5        H/O     0.575    0.481    0.031
x6        W/FE    0.451   -0.126    0.369
x7        W/Q     0.477   -0.296    0.462
x8        FB/FE   0.615   -0.191   -0.289
x9        FB/Q    0.519   -0.358   -0.381
x10       FB/O    0.602    0.168   -0.219
Table 2: Loading matrix giving the unrotated loadings from a three-factor model of the social mobility
data.
(b) Rotations can be carried out to determine whether simple structure can be achieved. The
factor loadings obtained from an orthogonal (varimax) rotation and an oblique (oblimin)
rotation of the three-factor solution are shown in Tables 3 and 4. Comment on the results.
[Marks: 7.5]
(Not more than 250 words.)
[Total marks: 15]
Variable  Code     α̂_i1     α̂_i2     α̂_i3
x1        HF/O    0.576    0.042    0.111
x2        WF/O    0.516    0.086    0.090
x3        H/FE    0.329    0.288    0.416
x4        H/Q     0.135    0.360    0.485
x5        H/O     0.728    0.113    0.144
x6        W/FE    0.163    0.078    0.568
x7        W/Q     0.042    0.106    0.718
x8        FB/FE   0.209    0.645    0.194
x9        FB/Q    0.018    0.723    0.140
x10       FB/O    0.491    0.434    0.098
Table 3: Loading matrix giving the varimax rotated loadings from a three-factor model of the social
mobility data.
Variable  Code     α̂_i1     α̂_i2     α̂_i3
x1        HF/O   -0.064    0.599    0.025
x2        WF/O   -0.003    0.530    0.002
x3        H/FE    0.183    0.246    0.353
x4        H/Q     0.279    0.015    0.445
x5        H/O    -0.016    0.747    0.025
x6        W/FE   -0.051    0.074    0.585
x7        W/Q    -0.032   -0.085    0.765
x8        FB/FE   0.637    0.101    0.058
x9        FB/Q    0.762   -0.109    0.014
x10       FB/O    0.381    0.452   -0.052
Table 4: Loading matrix giving the oblimin rotated loadings from a three-factor model of the social
mobility data.
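When reading the loading tables in Question 3, it may help to compute each variable's communality (the row sum of squared loadings) and the sum of squared loadings for each factor. The following is a minimal Python sketch of that arithmetic, not a model answer; it assumes the factors are orthogonal, so it is applied only to the unrotated and varimax loadings, transcribed from the two corresponding tables. The function names `communalities` and `variance_explained` are illustrative choices, not part of the coursework.

```python
# Loadings transcribed from the unrotated and varimax tables in Question 3.
# Rows are x1..x10; columns are factors 1..3.
LABELS = ["HF/O", "WF/O", "H/FE", "H/Q", "H/O",
          "W/FE", "W/Q", "FB/FE", "FB/Q", "FB/O"]

UNROTATED = [
    (0.426, 0.403, 0.053),
    (0.404, 0.343, 0.008),
    (0.592, -0.026, 0.116),
    (0.558, -0.240, 0.118),
    (0.575, 0.481, 0.031),
    (0.451, -0.126, 0.369),
    (0.477, -0.296, 0.462),
    (0.615, -0.191, -0.289),
    (0.519, -0.358, -0.381),
    (0.602, 0.168, -0.219),
]

VARIMAX = [
    (0.576, 0.042, 0.111),
    (0.516, 0.086, 0.090),
    (0.329, 0.288, 0.416),
    (0.135, 0.360, 0.485),
    (0.728, 0.113, 0.144),
    (0.163, 0.078, 0.568),
    (0.042, 0.106, 0.718),
    (0.209, 0.645, 0.194),
    (0.018, 0.723, 0.140),
    (0.491, 0.434, 0.098),
]


def communalities(loadings):
    """Row sums of squared loadings (valid for orthogonal factors)."""
    return [sum(a * a for a in row) for row in loadings]


def variance_explained(loadings):
    """Column sums of squared loadings, one value per factor."""
    n_factors = len(loadings[0])
    return [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]


if __name__ == "__main__":
    h_unrot = communalities(UNROTATED)
    h_vmax = communalities(VARIMAX)
    print("code   unrotated  varimax")
    for label, hu, hv in zip(LABELS, h_unrot, h_vmax):
        print(f"{label:<6} {hu:9.3f} {hv:8.3f}")
    print("SS loadings (unrotated):",
          [round(v, 3) for v in variance_explained(UNROTATED)])
    print("SS loadings (varimax):  ",
          [round(v, 3) for v in variance_explained(VARIMAX)])
```

An orthogonal rotation leaves communalities unchanged, so the two columns should agree up to rounding of the published loadings, while the per-factor sums of squares are redistributed. The oblimin loadings are left out on purpose: with correlated factors, row sums of squared pattern loadings are not communalities unless the factor correlations are also taken into account.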
4. Read the paper “Modeling longevity risks using a principal component approach: A comparison
with existing stochastic mortality models” by Sharon S. Yang, Jack C. Yue and Hong-Chih
Huang (2010), available on Moodle. Write a report (not more than 3 pages) summarising the
goals of the work [2], the data source [2], the methods used [4], and the results of the analysis and the
conclusions [6]. Also comment on the robustness and generality of the results, the limitations of
the analysis, and possible improvements [6].
[Total marks: 20]