
民情指數計算方法 (2023年7月4日)

Methodology of PSI (2023-07-04)

「民情指數」基本概念

香港民研在2012年制定「民情指數」(PSI),目的在於量化香港市民對香港社會的情緒反應,以解釋及預視社會出現集體行動的可能性。民情指數包涵了「政通」和「人和」兩個概念,分別以「政評數值(GA)」和「社評數值(SA)」顯示。政評數值泛指市民對整體政府管治的表現評價,而社評數值則泛指市民對整體社會狀況的評價。民情指數由十項民意數字組合而成,數據來源始於1992年7月,累積數據超過30年。

在「政通」方面,政評數值涵蓋4條具指標作用的問題,分別為:

GA1: 請你對港督彭定康/特首董建華/特首曾蔭權/特首梁振英/特首林鄭月娥/特首李家超嘅支持程度給予評分,0分代表絕對唔支持,100分代表絕對支持,50分代表一半半,你會比幾多分港督彭定康/特首董建華/特首曾蔭權/特首梁振英/特首林鄭月娥/特首李家超?
GA2: 假設明天選舉特首,而你又有權投票,你會唔會選董建華/曾蔭權/梁振英/林鄭月娥/李家超做特首?
GA3: 你對特區政府嘅整體表現滿唔滿意?(追問程度)
GA4: 整體嚟講,你信唔信任香港政府/香港特區政府?(追問程度)

在「人和」方面,社評數值涵蓋另外6條具指標作用的問題,分別為:

SA1: 整體嚟講,你對香港而家嘅政治狀況有幾滿意或者不滿?(追問程度)
SA2: 整體嚟講,你對香港而家嘅經濟狀況有幾滿意或者不滿?(追問程度)
SA3: 整體嚟講,你對香港而家嘅社會/民生狀況有幾滿意或者不滿?(追問程度)
SA4-1: 請你用0至10分評價政治狀況對你滿唔滿意香港社會整體狀況有幾重要,0分代表完全唔重要,10分代表十分重要,5分代表一般重要。你畀幾多分政治狀況嘅重要程度?
SA4-2: 請你用0至10分評價經濟狀況對你滿唔滿意香港社會整體狀況有幾重要,0分代表完全唔重要,10分代表十分重要,5分代表一般重要。你畀幾多分經濟狀況嘅重要程度?
SA4-3: 請你用0至10分評價民生狀況對你滿唔滿意香港社會整體狀況有幾重要,0分代表完全唔重要,10分代表十分重要,5分代表一般重要。你畀幾多分民生狀況嘅重要程度?

 

「民情指數」計算方法

第一步是把上述10條問題所得數據以下述方法各自轉化成為單一數字:

GA1(非標準化): 

計算這個問題中有效樣本的平均值,得出一個初始值為0~100的數字

GA2(非標準化):  

將回答「會」的百分比減去「不會」的百分比,得出這個問題中所有有效樣本的淨支持值,初始值為-100 ~ +100

GA3、GA4、SA1、SA2、SA3(非標準化)[1]

將五等量尺答案按照正面程度,以1分最低、5分最高量化成為1、2、3、4、5分,再計算每個問題的有效樣本的平均值,得出初始值為1~5的數字

SA4-1、SA4-2、SA4-3(非標準化及轉化值):

首先,分別計算每個問題中有效評分的平均值,範圍為0~10,然後分別除以三個平均值的總和,範圍為0~30,從而得到3個轉化值。每個轉化值範圍為0~1,其總和等於1。

[1] 2012年或之前,如果用於計算非標準化的社評數值的所有6個指標在某一時期沒有更新,香港民研將使用同一時期中非標準化的政評數值,以簡單的線性回歸法推算出非標準化的社評數值。自2013年起,此方法改為直接採用最新公佈的數字。


第二步是把所有從最初的量化過程中獲得的數字通過以下方法進一步處理,以產生標準化及最終數字:

GA1、GA2、GA3、GA4、SA1、SA2、SA3(標準化):

根據從1992年以來直到早一個月獲得的研究結果,每個轉化的數字都被標準化,轉化為正態分布,平均值設定為100,標準差設定為15,亦即每個數字都被轉化為符合所述正態曲線的另一個數字。

非標準化的政評數值(GA):

未標準化的政評數值是通過選取GA1、GA2、GA3和GA4已轉化值的平均值來計算,每個值都符合正態曲線。正態曲線平均值設置為100,標準差設置為15。

最終政評數值(GA):

根據從1992年以來直到早一個月獲得的研究結果,對未標準化數字進行標準化程序,將其轉化為正態分布,其平均值設定為100,標準差設定為15。完成後獲得最終的政評數值。

非標準化的社評數值(SA):

以轉化為0~1的SA4-1、SA4-2、SA4-3的權重來計算非標準化的社評數值,計算公式如下:非標準化的社評數值 = (標準化_SA1 × 轉化值_SA4-1) + (標準化_SA2 × 轉化值_SA4-2) + (標準化_SA3 × 轉化值_SA4-3)。

最終社評數值(SA):

根據從1992年以來直到早一個月獲得的研究結果,對未標準化數字進行標準化程序,將其轉化為正態分布,其平均值設定為100,標準差設定為15。完成後獲得最終的社評數值。

最終民情指數(PSI):

未標準化的民情指數是通過選取最終的政評數值和最終的社評數值的平均值來計算,然後根據自1992年以來直到早一個月獲得的研究結果進行標準化程序,轉化為正態分布。正態分布的平均值設定為100,標準差設定為15。

缺數處理和方法更新

由於部分民情指數的成份調查項目在1992年尚未開展,這些調查項目在缺數階段會被撇除,而SA4部分則會在缺數階段全部假設為三分之一。在有關調查項目開始後,如果相關民意數字在計算指數時沒有更新,香港民研會採用最近一次已公佈的數字替代。至於各項數據的標準化過程,第一代民情指數基本是以1992年7月為起點,然後以某些特首任期結束的日子為轉接,成為用作標準化的數據庫,以下為簡略說明:

特首及任期 民情指數計算時期 標準化數據庫涵蓋年份 標準化數據庫涵蓋年期
彭定康
(1992-1997)
1992年7月至1997年6月[2] 1992年7月至2012年6月 20年
董建華
(1997-2005)
1997年7月至2005年3月[2] 1992年7月至2012年6月 20年
曾蔭權
(2005-2012)
2005年6月至2012年6月[2] 1992年7月至2012年6月 20年
梁振英
(2012-2017)
2012年7月至2017年6月 1992年7月至2012年6月 20年
林鄭月娥
(2017-2022)
2017年7月至2022年6月 1992年7月至2017年6月 25年

[2] 由於民情指數在2012年才開始使用,這些早期數值需要以追溯形式運算得出。


及至第二代,民情指數的標準化數據庫依然是以1992年7月為起點,但就以最早五年為第一個標準化數據庫,然後每月累積下去,簡略說明如下:

特首及任期 民情指數計算時期 標準化數據庫涵蓋年份 標準化數據庫涵蓋月數
彭定康
(1992-1997)
1992年7月至1997年6月[3] 1992年7月至1997年6月 60個月
董建華
(1997-2005)
1997年7月[3] 1992年7月至1997年6月 60個月
1997年8月[3] 1992年7月至1997年7月… 61個月…
曾蔭權
(2005-2012)
2005年6月[3] 1992年7月至2005年5月 155個月
2005年7月[3] 1992年7月至2005年6月… 156個月…
梁振英
(2012-2017)
2012年7月 1992年7月至2012年6月 240個月
2012年8月… 1992年7月至2012年7月… 241個月…
林鄭月娥
(2017-2022)
2017年7月 1992年7月至2017年6月 300個月
2017年8月… 1992年7月至2017年7月… 301個月…
李家超
(2022- )
2022年7月… 1992年7月至2022年6月… 360個月…
2023年6月 1992年7月至2023年5月 371個月

[3] 由於民情指數在2012年才開始使用,這些早期數值需要以追溯形式運算得出。

數值理解

民情指數、政評數值及社評數值的標準化過程,皆以正態分布為準,平均值設定為100,標準差設定為15,與人類智商(IQ)的分布形態看齊,亦即每個數字都被轉化為符合所述正態曲線的另一個數字。數字愈低,代表民情愈差,數字愈高,則代表民情愈佳,中間正常水平則為100。具體數值可按下表理解:

指數數值 百分位數 指數數值 百分位數
140+ 最高1% 60- 最低1%
125 最高5% 75 最低5%
120 最高10% 80 最低10%
110 最高25% 90 最低25%
100為正常數值,即半數在上,半數在下

 

 

Basic Concepts

In 2012, HKPORI compiled the “Public Sentiment Index (PSI)” to quantify Hong Kong people’s sentiments towards society, in order to explain and predict the likelihood of mass movements. PSI comprises 2 components: the Government Appraisal (GA) Score and the Society Appraisal (SA) Score. GA refers to people’s appraisal of the government’s overall governance, while SA refers to people’s appraisal of overall social conditions. PSI is built from 10 public opinion indicators, with data collected since July 1992, giving over 30 years of accumulated data.

For “Government Appraisal”, there are 4 indicator questions, as follows:

GA1: Please use a scale of 0-100 to rate your extent of support for Governor Chris Patten / Chief Executive (CE) Tung Chee-hwa / CE Donald Tsang / CE Leung Chun-ying / CE Carrie Lam / CE John Lee, with 0 indicating absolutely not supportive, 100 indicating absolutely supportive and 50 indicating half-half. How would you rate Governor Chris Patten / Chief Executive (CE) Tung Chee-hwa / CE Donald Tsang / CE Leung Chun-ying / CE Carrie Lam / CE John Lee?
GA2: If a general election of the Chief Executive were to be held tomorrow, and you had the right to vote, would you vote for Tung Chee-hwa / Donald Tsang / Leung Chun-ying / Carrie Lam / John Lee?
GA3: Are you satisfied with the performance of the HKSAR government? (Interviewer to probe intensity)
GA4: On the whole, do you trust the Hong Kong/Hong Kong SAR government? (Interviewer to probe intensity)

For “Society Appraisal”, there are these 6 indicator questions:

SA1: Generally speaking, how much are you satisfied or dissatisfied with the current political condition in Hong Kong? (Interviewer to probe intensity)
SA2: Generally speaking, how much are you satisfied or dissatisfied with the current economic condition in Hong Kong? (Interviewer to probe intensity)
SA3: Generally speaking, how much are you satisfied or dissatisfied with the current livelihood condition in Hong Kong? (Interviewer to probe intensity)
SA4-1: Please rate on the scale of 0-10 the importance of political condition in your overall satisfaction with Hong Kong’s societal condition, with 0 meaning absolutely not important, 10 meaning absolutely important, 5 meaning moderately important. How would you rate the importance of political condition?
SA4-2: Please rate on the scale of 0-10 the importance of economic condition in your overall satisfaction with Hong Kong’s societal condition, with 0 meaning absolutely not important, 10 meaning absolutely important, 5 meaning moderately important. How would you rate the importance of economic condition?
SA4-3: Please rate on the scale of 0-10 the importance of livelihood condition in your overall satisfaction with Hong Kong’s societal condition, with 0 meaning absolutely not important, 10 meaning absolutely important, 5 meaning moderately important. How would you rate the importance of livelihood condition?

Computation Method

Step One is to quantify the data from the 10 questions into numbers using the following method:

GA1 (unstandardized):

Calculate the mean of valid cases for this question, resulting in a number with initial value ranging 0~100.

GA2 (unstandardized):

Subtract the “No” percentage from the “Yes” percentage to obtain the net support value among valid cases for this question, which is a number with initial value ranging -100 ~ +100.
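As a minimal sketch of the GA2 net support calculation, the snippet below works on a list of individual answers. The answer codes ("yes", "no", "dk") and the use of None for invalid cases are assumptions made for illustration; the source does not state which answers count as valid cases.

```python
def net_support(responses):
    """Return the 'yes' percentage minus the 'no' percentage among valid cases.

    Assumption: None marks an invalid case; any other answer (for example
    "dk" for "don't know") stays in the base. The source does not spell out
    which answers count as valid.
    """
    valid = [r for r in responses if r is not None]
    if not valid:
        raise ValueError("no valid cases")
    yes = sum(1 for r in valid if r == "yes")
    no = sum(1 for r in valid if r == "no")
    return 100.0 * (yes - no) / len(valid)


sample = ["yes", "yes", "no", "dk", None]
print(net_support(sample))  # 25.0 (50% yes minus 25% no among 4 valid cases)
```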

GA3, GA4, SA1, SA2, SA3 (unstandardized) [1]:

Quantify the individual responses into 1, 2, 3, 4, 5 marks according to their degree of positive level, where 1 is the lowest and 5 the highest, and then calculate the means of valid cases for each of these questions, resulting in numbers with initial values each ranging 1~5.
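The five-point scoring can be sketched as follows. The answer labels in SCALE are hypothetical placeholders (the actual response options differ by question), and answers outside the scale are treated as invalid, which is an assumption.

```python
# Hypothetical labels for a five-point satisfaction scale, least to most positive.
SCALE = {
    "very dissatisfied": 1,
    "somewhat dissatisfied": 2,
    "half-half": 3,
    "somewhat satisfied": 4,
    "very satisfied": 5,
}


def likert_mean(answers):
    """Mean score (1 to 5) over answers that fall on the five-point scale."""
    scores = [SCALE[a] for a in answers if a in SCALE]
    if not scores:
        raise ValueError("no valid cases")
    return sum(scores) / len(scores)


answers = ["very satisfied", "half-half", "very dissatisfied", "don't know"]
print(likert_mean(answers))  # 3.0, the "don't know" answer is ignored
```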

SA4-1, SA4-2, SA4-3 (unstandardized and transformed values):

First calculate the mean of the valid ratings for each of these questions separately, each ranging 0~10. Then divide each mean by the sum of the three means, which ranges 0~30, to obtain 3 transformed values each ranging 0~1 and summing to 1.
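A short sketch of this normalization, with a hypothetical function name and made-up mean ratings:

```python
def sa4_weights(mean_political, mean_economic, mean_livelihood):
    """Divide each mean importance rating (0 to 10) by the sum of the three."""
    means = (mean_political, mean_economic, mean_livelihood)
    total = sum(means)
    if total == 0:
        raise ValueError("all three mean ratings are zero")
    return tuple(m / total for m in means)


# Made-up mean ratings of 8, 6 and 6 give weights of about 0.4, 0.3 and 0.3.
weights = sa4_weights(8.0, 6.0, 6.0)
print(weights, sum(weights))
```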

[1] In or before 2012, if none of the 6 indicators used to compute the unstandardized SA score had been updated in a given period, HKPORI would use simple linear regression to estimate the unstandardized SA score from the unstandardized GA score of the same period. Starting from 2013, this method has been replaced by direct adoption of the most recently announced figures.


Step Two is to obtain the standardized and final scores from the numbers obtained from the initial quantification process:

GA1, GA2, GA3, GA4, SA1, SA2, SA3 (standardized):

Each of the transformed numbers was standardized according to a scheme derived from previous findings obtained since 1992 up to the month before and transformed to a normal distribution with the mean value set at 100 and standard deviation set at 15, meaning that each number was transformed into another number fitting the normal curve described.
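The source does not spell out the exact transformation, so the following is only one plausible reading: a linear z-score rescaling of the current value against the database of earlier values. The function name, the use of the population standard deviation and the sample numbers are all assumptions.

```python
from statistics import mean, pstdev


def standardize(value, history):
    """Rescale value so that the history has mean 100 and SD 15.

    Assumptions: a linear z-score rescaling and the population standard
    deviation; a rank-based normal transform would be another possible reading.
    """
    spread = pstdev(history)
    if spread == 0:
        raise ValueError("history has no variation")
    return 100 + 15 * (value - mean(history)) / spread


history = [2.0, 3.0, 4.0]         # made-up earlier values of one indicator
print(standardize(3.0, history))  # 100.0, the value equals the historical mean
print(standardize(4.0, history))  # above 100, the value is above the mean
```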

Unstandardized GA:

An unstandardized GA score was calculated by simply taking the mean of the transformed values of GA1, GA2, GA3 and GA4, each fitting the normal curve with mean value set at 100 and standard deviation set at 15.

Final GA:

Unstandardized GA was then standardized according to a scheme derived from previous findings obtained since 1992 up to the month before and transformed to a normal distribution with the mean value set at 100 and standard deviation set at 15, to obtain the final GA score.

Unstandardized SA:

The transformed SA4-1, SA4-2 and SA4-3, each ranging 0~1, were used as weights to calculate an unstandardized SA score using this formula:

Unstandardized SA = (Standardized_SA1 × Transformed_SA4-1) + (Standardized_SA2 × Transformed_SA4-2) + (Standardized_SA3 × Transformed_SA4-3)
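The weighted sum can be written as a small function. The input numbers below are made up for illustration.

```python
def unstandardized_sa(standardized, weights):
    """Weighted sum of standardized SA1..SA3 using transformed SA4-1..SA4-3."""
    if len(standardized) != 3 or len(weights) != 3:
        raise ValueError("expected three standardized values and three weights")
    return sum(s * w for s, w in zip(standardized, weights))


# Standardized SA1, SA2, SA3 of 90, 105, 100 with weights 0.4, 0.3, 0.3:
# 36 + 31.5 + 30, which is about 97.5.
print(unstandardized_sa((90.0, 105.0, 100.0), (0.4, 0.3, 0.3)))
```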

Final SA:

Unstandardized SA was then standardized according to a scheme derived from previous findings obtained since 1992 up to the month before and transformed to a normal distribution with the mean value set at 100 and standard deviation set at 15, to obtain the final SA score.

Final PSI:

An unstandardized PSI score was calculated by simply taking the mean of the final GA and final SA, and then standardized according to a scheme derived from previous findings obtained since 1992 up to the month before and transformed to a normal distribution with the mean value set at 100 and standard deviation set at 15.
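Putting Step Two together, the order of operations can be sketched as below. The rescale helper assumes a linear z-score rescaling with the population standard deviation, which the source does not confirm, and all names and histories are hypothetical.

```python
from statistics import mean, pstdev


def rescale(value, history):
    """Assumed reading of 'standardized': linear rescaling to mean 100, SD 15."""
    return 100 + 15 * (value - mean(history)) / pstdev(history)


def compose_psi(ga_std, sa_std, sa_weights, ga_hist, sa_hist, psi_hist):
    """Combine standardized indicators into final GA, SA and PSI.

    ga_std: standardized GA1..GA4; sa_std: standardized SA1..SA3;
    sa_weights: transformed SA4-1..SA4-3;
    *_hist: earlier unstandardized GA / SA / PSI values (hypothetical inputs).
    """
    ga_final = rescale(mean(ga_std), ga_hist)
    sa_raw = sum(s * w for s, w in zip(sa_std, sa_weights))
    sa_final = rescale(sa_raw, sa_hist)
    psi_final = rescale(mean([ga_final, sa_final]), psi_hist)
    return ga_final, sa_final, psi_final


ga, sa, psi = compose_psi(
    ga_std=[110, 110, 110, 110],
    sa_std=[100, 100, 100],
    sa_weights=(1 / 3, 1 / 3, 1 / 3),
    ga_hist=[90, 100, 110],
    sa_hist=[90, 100, 110],
    psi_hist=[95, 100, 105],
)
print(ga, sa, psi)
```

Note that standardization is applied three times in this chain: to each indicator, to the GA and SA composites, and to the PSI itself.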

Handling of Missing Data and Revision of Computation Method

Since some of the component survey series had not yet started in 1992, those items are excluded for the period in which they are missing, while each of the three SA4 weights is assumed to be one-third for that period. After a survey series has started, if its figure has not been updated when the indices are calculated, the most recently published figure is used instead. As for the standardization of the various values, the first generation of PSI basically takes July 1992 as the starting point and the end of certain CEs’ terms of office as the end point of the standardization database. The following table briefly explains:

| CE and term | Period of PSI calculation | Period covered by standardization database | Years covered in the database |
| --- | --- | --- | --- |
| Chris Patten (1992-1997) | July 1992 to June 1997 [2] | July 1992 to June 2012 | 20 years |
| Tung Chee-hwa (1997-2005) | July 1997 to March 2005 [2] | July 1992 to June 2012 | 20 years |
| Donald Tsang (2005-2012) | June 2005 to June 2012 [2] | July 1992 to June 2012 | 20 years |
| CY Leung (2012-2017) | July 2012 to June 2017 | July 1992 to June 2012 | 20 years |
| Carrie Lam (2017-2022) | July 2017 to June 2022 | July 1992 to June 2017 | 25 years |

[2] As the PSI was only introduced in 2012, these earlier values had to be computed retrospectively.
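The missing-data rule described before the table above (use the most recently published figure when an indicator has not been updated, and exclude an indicator before its series begins) can be sketched as follows. Representing missing months as None and the names used here are assumptions for illustration.

```python
# Each SA4 weight is taken as one-third while the SA4 questions are missing.
DEFAULT_SA4_WEIGHTS = (1 / 3, 1 / 3, 1 / 3)


def carry_forward(series):
    """Fill None entries with the most recent earlier value.

    Leading None entries (the series has not started yet) stay None, so the
    caller can exclude that indicator for those months.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


monthly = [None, 3.1, None, None, 2.8]  # made-up monthly means
print(carry_forward(monthly))           # [None, 3.1, 3.1, 3.1, 2.8]
```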

For the second generation of PSI, HKPORI still takes July 1992 as the starting point, but uses the first five years of data as the first standardization database and then extends it month by month. The following table briefly explains:

| CE and term | Period of PSI calculation | Period covered by standardization database | Months covered in the database |
| --- | --- | --- | --- |
| Chris Patten (1992-1997) | July 1992 to June 1997 [3] | July 1992 to June 1997 | 60 months |
| Tung Chee-hwa (1997-2005) | July 1997 [3] | July 1992 to June 1997 | 60 months |
| | August 1997 [3] | July 1992 to July 1997… | 61 months… |
| Donald Tsang (2005-2012) | June 2005 [3] | July 1992 to May 2005 | 155 months |
| | July 2005 [3] | July 1992 to June 2005… | 156 months… |
| CY Leung (2012-2017) | July 2012 | July 1992 to June 2012 | 240 months |
| | August 2012… | July 1992 to July 2012… | 241 months… |
| Carrie Lam (2017-2022) | July 2017 | July 1992 to June 2017 | 300 months |
| | August 2017… | July 1992 to July 2017… | 301 months… |
| John Lee (2022- ) | July 2022… | July 1992 to June 2022… | 360 months… |
| | June 2023 | July 1992 to May 2023 | 371 months |

[3] As the PSI was only introduced in 2012, these earlier values had to be computed retrospectively.
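The month counts in the second-generation table follow a simple pattern: the database runs from July 1992 up to the month before the calculation month, with the first five years (60 months) as a floor. A sketch with a hypothetical function name:

```python
def database_months(year, month):
    """Months in the second-generation standardization database.

    Counts July 1992 up to the month before (year, month), never fewer than
    the initial 60 months (July 1992 to June 1997).
    """
    if (year, month) < (1992, 7):
        raise ValueError("PSI data start in July 1992")
    elapsed = (year - 1992) * 12 + (month - 7)
    return max(60, elapsed)


for ym in [(1997, 7), (1997, 8), (2005, 6), (2012, 7), (2017, 7), (2023, 6)]:
    print(ym, database_months(*ym))  # 60, 61, 155, 240, 300, 371
```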



Understanding the Index Values

PSI, GA and SA values are all standardized to a normal distribution with the mean set at 100 and the standard deviation set at 15, the same scale commonly used for Intelligence Quotient (IQ) scores. The lower the value, the poorer the public sentiment; the higher the value, the better the public sentiment; 100 is the normal level. Specific values can be interpreted using this table:

| Value | Percentile | Value | Percentile |
| --- | --- | --- | --- |
| 140 or above | Top 1% | 60 or below | Bottom 1% |
| 125 | Top 5% | 75 | Bottom 5% |
| 120 | Top 10% | 80 | Bottom 10% |
| 110 | Top 25% | 90 | Bottom 25% |

100 is the normal level, meaning half above and half below.
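For readers who want to relate other index values to percentiles, the tail shares of a normal curve with mean 100 and standard deviation 15 can be computed directly. This is a sketch using Python's standard library, not HKPORI's own procedure. Under an exact normal curve the shares above 110, 120, 125 and 140 come to roughly 25%, 9%, 5% and 0.4%, so the table reads best as a rounded guide.

```python
from statistics import NormalDist

# Normal curve with mean 100 and standard deviation 15, as described above.
PSI_DIST = NormalDist(mu=100, sigma=15)


def share_above(value):
    """Share of the distribution lying above the given index value."""
    return 1 - PSI_DIST.cdf(value)


def share_below(value):
    """Share of the distribution lying below the given index value."""
    return PSI_DIST.cdf(value)


for v in (110, 120, 125, 140):
    print(v, f"{share_above(v):.1%}")
```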