[Full Text] Frauds in the Korea 2020 Parliamentary Election - Walter R. Mebane, Jr.
  • Editorial desk
  • Approved 2020.04.30 23:25

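As controversy grows over allegations that early voting in the 21st National Assembly election was manipulated, a document by an expert regarded as a world authority in the field has been made public.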

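The document, "Frauds in the Korea 2020 Parliamentary Election," was written by Walter R. Mebane, Jr., a professor of political science at the University of Michigan in the United States. This paper obtained the document and publishes the full text in the interest of the public's right to know.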

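Professor Walter R. Mebane, Jr. has developed and supports software for detecting election fraud, and has written a number of documents identifying elements of fraud in elections around the world, most recently this one; he is counted among the foremost scholars in this field.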

Frauds in the Korea 2020 Parliamentary Election∗

Walter R. Mebane, Jr.†                           

April 28, 2020

The statistical model implemented in eforensics¹ offers evidence that fraudulent votes occurred in the election that may have changed some election outcomes. The statistical model operationalizes the idea that “frauds” occur when one party gains votes by a combination of manufacturing votes from abstentions and stealing votes from opposing parties.
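The idea of manufacturing and stealing votes can be illustrated with a toy accounting exercise. The sketch below is not the eforensics model itself, which is a Bayesian statistical model; the function `apply_frauds`, its rate parameters, and the counts are all hypothetical and serve only to show how the two mechanisms move votes between categories.

```python
def apply_frauds(abstain, leader, opposition, manufacture_rate, steal_rate):
    """Shift a share of abstentions and of opposition votes to the leading party.

    All names and rates here are illustrative assumptions, not eforensics parameters.
    """
    manufactured = round(abstain * manufacture_rate)  # abstentions recorded as leader votes
    stolen = round(opposition * steal_rate)           # opposition votes recorded as leader votes
    return {
        "abstain": abstain - manufactured,
        "leader": leader + manufactured + stolen,
        "opposition": opposition - stolen,
        "manufactured": manufactured,
        "stolen": stolen,
    }

# Hypothetical unit with 1000 eligible voters.
true_counts = {"abstain": 400, "leader": 330, "opposition": 270}
observed = apply_frauds(**true_counts, manufacture_rate=0.25, steal_rate=0.10)

eligible = sum(true_counts.values())
turnout_true = (true_counts["leader"] + true_counts["opposition"]) / eligible
turnout_observed = (observed["leader"] + observed["opposition"]) / eligible
print(observed)
print(turnout_true, turnout_observed)
```

In this toy example manufactured votes raise both turnout and the leader's vote proportion, while stolen votes raise only the leader's proportion; the number of eligible voters is unchanged.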

The Bayesian specification² allows posterior means and credible intervals for counts of “fraudulent” votes to be determined both for the entire election and for observed individual aggregation units. It is important to keep in mind that “frauds” according to the eforensics model may or may not be results of malfeasance and bad actions.
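As a minimal sketch of how a posterior mean and credible intervals can be read off a set of posterior draws, the following uses simulated draws in place of real sampler output. The draws, the helper `credible_interval`, and the equal-tailed nearest-rank rule are assumptions for illustration; the source does not say how eforensics computes its intervals.

```python
import random
import statistics

def credible_interval(draws, level):
    """Equal-tailed interval: cut (1 - level) / 2 of the draws from each tail."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int(round((1.0 - tail) * (len(ordered) - 1)))
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical posterior draws of a fraudulent-vote count for one unit.
random.seed(0)
draws = [random.gauss(1000.0, 50.0) for _ in range(4000)]

posterior_mean = statistics.fmean(draws)
ci95 = credible_interval(draws, 0.95)
ci995 = credible_interval(draws, 0.995)
print(posterior_mean, ci95, ci995)
```

The 99.5% interval always contains the 95% interval under this rule, because it trims fewer draws from each tail.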

How much estimated “frauds” may be produced by normal political activity, and in particular by strategic behavior, is an open question that is the focus of current research. Statistical findings such as are reported here should be followed up with additional information and further investigation into what happened.

The statistical findings alone cannot stand as definitive evidence about what happened in an election.

Figure 1 shows the distribution of turnout and vote proportions across aggregation units.³ Each turnout proportion is (Number Valid)/(Number Eligible), and each vote proportion is (Number Voting for Party)/(Number Eligible).

The data include counts for n = 19072 units. 328 “abroad office” observations have zero eligible voters but often a small number of votes—the largest number is 23—and are omitted from the plots.
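The two proportions and the omission of zero-eligible units can be sketched as follows. The unit records and field names are hypothetical; only the two ratio definitions and the rule of dropping units with zero eligible voters come from the text.

```python
def unit_proportions(units):
    """Return turnout and vote proportions for units that have eligible voters."""
    rows = []
    for unit in units:
        if unit["eligible"] == 0:
            # No proportion is defined without eligible voters, so the unit is skipped.
            continue
        rows.append({
            "name": unit["name"],
            "turnout": unit["valid"] / unit["eligible"],
            "vote_share": unit["party"] / unit["eligible"],
        })
    return rows

# Hypothetical units; names and counts are invented for illustration.
units = [
    {"name": "unit A", "eligible": 2000, "valid": 1300, "party": 700},
    {"name": "unit B", "eligible": 500, "valid": 450, "party": 440},
    {"name": "abroad office", "eligible": 0, "valid": 23, "party": 12},
]
rows = unit_proportions(units)
for row in rows:
    print(row)
```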

Figure 1(a) uses vote proportions defined based on Democratic Party votes, and Figure 1(b) uses vote proportions defined based on the votes received by the party with the most votes in each constituency.

Fraud allegations have focused on the Democratic Party, but a principled way to analyze the single-member district election data is to consider that frauds potentially benefited the leading candidate in each constituency.

In the figure, differences between the two distributions are apparent, but both share a distinctive multimodal pattern. There appear to be clusters of observations that share distinctive levels of turnout and votes, some with low, medium, high and very high turnout.

The diagonal edge feature in the plots results from using Number Eligible as the denominator for both proportions: when the party receives nearly all the valid votes, then the observation is near that diagonal.
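The diagonal edge can be stated as a simple inequality. The symbols below are assumed notation, not taken from the source: N is Number Eligible, V is Number Valid, W is Number Voting for Party, t is the turnout proportion and w is the vote proportion.

```latex
% Assumed notation (not from the source): N = Number Eligible (N > 0),
% V = Number Valid, W = Number Voting for Party.
\[
  t = \frac{V}{N}, \qquad w = \frac{W}{N}, \qquad
  0 \le W \le V \;\Longrightarrow\; 0 \le w \le t ,
\]
\[
  w = t \iff W = V .
\]
```

No observation can lie above the line w = t, and an observation sits on that line exactly when the party receives all of the valid votes.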

Figure 1: Korea 2020 Parliamentary Election Data Plots. (a) Democratic Party; (b) Constituency leaders.

 

Figures 2 and 3 show that the different clusters in Figure 1 correspond with observations that are administratively distinctive.

Figure 2 displays data for Democratic Party votes, and Figure 3 shows data for constituency leader votes. The four sets of units that have distinctive distributions are district-level, election-day units that are not abroad (Figures 2(a) and 3(a)), postal, election-day units (Figures 2(b) and 3(b)), abroad units (Figures 2(c) and 3(c)) and pre-vote units (Figures 2(d) and 3(d)). Each of the subsets (a), (b) and (d) has a mostly unimodal distribution: the marginal histograms are mostly near symmetric. But exceptional points are evident in each of these subsets. Abroad units are more distinctively bimodal when constituency leaders are considered than when the Democratic Party is in focus.

Figure 2: Korea 2020 Parliamentary Election Data Plots, Democratic Party                                                            

 

Note: plots show turnout (number voting/number eligible) and vote proportions (number voting for Democratic party/number eligible) for four subsets of observations: (a) district-level, election-day, not abroad; (b) postal election-day; (c) abroad; (d) pre-vote. Plots show scatterplots with estimated bivariate densities overlaid, with histograms along the axes. 328 “abroad office” observations reported with zero eligible voters but often with a positive number of votes are omitted.

Figure 3: Korea 2020 Parliamentary Election Data Plots, Constituency Leaders

Note: plots show turnout (number voting/number eligible) and vote proportions (number voting for constituency-leading party/number eligible) for four subsets of observations: (a) district-level, election-day, not abroad; (b) postal election-day; (c) abroad; (d) pre-vote. Plots show scatterplots with estimated bivariate densities overlaid, with histograms along the axes. 328 “abroad office” observations reported with zero eligible voters but often with a positive number of votes are omitted.

 

I estimate the eforensics model separately for the two definitions of leading party votes. Covariates for turnout and vote choice include indicators for pre-vote, postal, abroad and disabled-ship status and fixed effects for the 252 constituencies included in the data.

The two specifications agree that 418 aggregation units are fraudulent, but 869 additional units are fraudulent in the Democratic party specification and 745 additional units are fraudulent in the constituency-leading party specification. As Table 1 shows, key parameter estimates are similar in the models. Parameters for the probabilities of frauds (π1, π2, π3) are about the same between specifications, and coefficients for the turnout equation (τ1–τ5) are similar. Coefficients for vote choice (β1–β4) differ, reflecting the differences in vote proportions being modeled.

Figure 4 uses plots by subset of Democratic Party focused observations to illustrate which observations are fraudulent according to the eforensics model with the Democratic Party focused specification. Nonfraudulent observations are plotted in blue and fraudulent observations appear in red. The frequencies of fraudulent and not fraudulent units appear in the note at the bottom of the figure. Visually and by the numbers, frauds occur most frequently for pre-vote units (43.1% are fraudulent), next most frequently for district-level, election-day, not abroad units (3.14% fraudulent), then next most frequently for postal election-day units (0.925% are fraudulent). None of the abroad units are fraudulent.

Figure 5 uses plots by subset of constituency-leader focused observations to illustrate which observations are fraudulent according to the eforensics model with the constituency-leader focused specification. Nonfraudulent observations are plotted in blue and fraudulent observations appear in red. The frequencies of fraudulent and not fraudulent units appear in the note at the bottom of the figure. Visually and by the numbers, frauds occur most frequently for pre-vote units (22.6% are fraudulent), next most frequently for postal election-day units (2.09% are fraudulent), then next most frequently for district-level, election-day, not abroad units (0.920% fraudulent). None of the abroad units are fraudulent.

Table 1: Korea 2020 Parliamentary eforensics Estimates

 

Figure 4: Korea 2020 Fraud Plots, Democratic Party

Note: plots show turnout (number voting/number eligible) and vote proportions (number voting for Democratic Party/number eligible) for four subsets of observations: (a) district-level, election-day, not abroad (10 fraudulent, 318 not); (b) postal election-day (131 fraudulent, 14155 not); (c) abroad (0 fraudulent, 328 not); (d) pre-vote (1146 fraudulent, 2656 not). Plots show scatterplots with nonfraudulent observations in blue and fraudulent observations in red. 328 “abroad office” observations reported with zero eligible voters but often with a positive number of votes are omitted.

Figure 5: Korea 2020 Fraud Plots, Constituency Leaders

Note: plots show turnout (number voting/number eligible) and vote proportions (number voting for constituency-leading party/number eligible) for four subsets of observations: (a) district-level, election-day, not abroad (5 fraudulent, 323 not); (b) postal election-day (298 fraudulent, 13988 not); (c) abroad (0 fraudulent, 328 not); (d) pre-vote (860 fraudulent, 2942 not). Plots show scatterplots with nonfraudulent observations in blue and fraudulent observations in red. 328 “abroad office” observations reported with zero eligible voters but often with a positive number of votes are omitted.
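The unit counts in the notes to Figures 4 and 5 can be tabulated to recover the per-specification totals and the fraudulent share within each subset. The sketch below uses only counts printed in those notes; the variable names are illustrative, and the share is shown with two possible denominators (all units in the subset, or nonfraudulent units only) because the choice of denominator changes the percentage.

```python
# (fraudulent, not fraudulent) unit counts copied from the notes to Figures 4 and 5.
counts = {
    "Democratic Party": {
        "district-level election-day": (10, 318),
        "postal election-day": (131, 14155),
        "abroad": (0, 328),
        "pre-vote": (1146, 2656),
    },
    "constituency leaders": {
        "district-level election-day": (5, 323),
        "postal election-day": (298, 13988),
        "abroad": (0, 328),
        "pre-vote": (860, 2942),
    },
}

totals = {}
for spec, subsets in counts.items():
    fraud_total = sum(f for f, _ in subsets.values())
    unit_total = sum(f + n for f, n in subsets.values())
    totals[spec] = (fraud_total, unit_total)
    print(spec, "fraudulent units:", fraud_total, "of", unit_total)
    for name, (fraud, clean) in subsets.items():
        share_of_all = 100.0 * fraud / (fraud + clean)   # denominator: all units in subset
        ratio_to_clean = 100.0 * fraud / clean           # denominator: nonfraudulent units
        print(f"  {name}: {share_of_all:.3g}% of units, {ratio_to_clean:.3g}% of nonfraudulent")
```

The totals agree with the text: 1287 = 418 + 869 fraudulent units for the Democratic Party specification and 1163 = 418 + 745 for the constituency-leader specification, each out of 18,744 plotted units (19,072 less the 328 omitted “abroad office” observations). For the Democratic Party pre-vote units, for example, 1146/3802 is about 30.1%, while 1146/2656 is about 43.1%.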
 

I use a counterfactual method to calculate how many votes are fraudulent.⁴ Table 2 reports the observed counts of eligible voters, valid votes and votes for the (a) Democratic Party and (b) constituency-leading party totaled over all units in the analysis, along with fraudulent vote count totals. The total of “manufactured” votes is reported separately from the total number of fraudulent votes: manufactured votes are votes that the model estimates should have been abstentions but instead were observed as votes for the leading party.

Both posterior means and 95% and 99.5% credible intervals are reported. The results show that for the Democratic Party focused specification overall about 1,491,548 votes are fraudulent, and of the fraudulent votes about 1,122,169 are manufactured (the remaining 369,379 are stolen, that is, counted for the leading party when they should have been counted for a different party).

Overall, according to the eforensics model, about 10.43% of the votes for the Democratic Party candidates are fraudulent. The results show that for the constituency-leading focused specification overall about 1,171,734 votes are fraudulent, and of the fraudulent votes about 910,444 are manufactured (the remaining 261,290 are stolen, that is, counted for the leading party when they should have been counted for a different party). Overall, according to the eforensics model, about 7.26% of the votes for the constituency-leading candidates are fraudulent.
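The split of fraudulent votes into manufactured and stolen votes is a simple subtraction, which the following sketch checks against the posterior-mean totals quoted above. The dictionary layout and names are illustrative; the numbers are the ones reported in the text.

```python
# Posterior-mean totals quoted in the text for each specification.
reported = {
    "Democratic Party": {"fraudulent": 1_491_548, "manufactured": 1_122_169},
    "constituency leaders": {"fraudulent": 1_171_734, "manufactured": 910_444},
}

# Stolen votes are the fraudulent votes that were not manufactured from abstentions.
stolen = {
    spec: values["fraudulent"] - values["manufactured"]
    for spec, values in reported.items()
}

for spec, values in reported.items():
    manufactured_share = 100.0 * values["manufactured"] / values["fraudulent"]
    print(f"{spec}: stolen = {stolen[spec]:,}; "
          f"manufactured share of fraudulent votes = {manufactured_share:.1f}%")
```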

Fraudulent vote occurrence varies over constituencies.

Counts of frauds by aggregation unit appear in a supplemental file⁵, but I use the unit-specific fraudulent vote counts from the constituency-leader focused specification to assess whether the number of fraudulent votes is ever large enough apparently to change the winner of a constituency contest. For 236 constituencies it is not, but for 16 constituencies the number of fraudulent votes is large enough apparently to change the winner of the constituency contest. In 9 instances the apparently fraudulently winning party is the Democratic Party, in 6 instances it is the United Future Party and in the remaining instance it is an Independent candidate.
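One simple way to make such an assessment is sketched below. The rule used here is an assumption, not necessarily the one applied in the paper: the winner's estimated fraudulent votes are subtracted from its total and the result is compared with the runner-up's votes, without reassigning stolen votes to other parties. The constituencies and vote counts are hypothetical.

```python
def winner_changes(votes, fraudulent_for_winner):
    """True if removing the winner's fraudulent votes drops it below the runner-up.

    Assumed rule for illustration: stolen votes are not returned to other parties.
    """
    ranked = sorted(votes.values(), reverse=True)
    top, second = ranked[0], ranked[1]
    return top - fraudulent_for_winner < second

# Hypothetical contests: (votes by party, estimated fraudulent votes for the winner).
contests = {
    "constituency X": ({"party 1": 52_000, "party 2": 49_500}, 4_000),
    "constituency Y": ({"party 1": 60_000, "party 2": 41_000}, 4_000),
}
flagged = [
    name for name, (votes, fraud) in contests.items()
    if winner_changes(votes, fraud)
]
print(flagged)

# Consistency of the counts reported in the text.
assert 236 + 16 == 252   # constituencies not flagged + flagged = constituencies in the data
assert 9 + 6 + 1 == 16   # Democratic Party + United Future Party + Independent
```

Returning stolen votes to the runner-up would make the test stricter for the winner, so this rule is the more conservative of the two.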

Given two specifications, which one is better?

Probably neither model is correct, strictly speaking, even beyond the generality that no model is ever correct, but some are useful. If frauds only ever benefit the Democratic Party, then those frauds may have induced apparent frauds when we constrain frauds to benefit only constituency-leading candidates, because many of these do not affiliate with the Democratic Party.

Table 2: Korea 2020 eforensics Estimated Fraudulent Vote Counts

Similarly if only constituency-leading candidates benefit from frauds, then eforensics may be producing misleading results when we constrain frauds to benefit only the Democratic Party. Or perhaps other candidates—or several in each constituency—benefit from frauds and both specifications are producing misleading results. Possibly, of course, there are no frauds and something else is going on.

Caveats are many. The most basic caution is to keep in mind that “frauds” according to the eforensics model may or may not be results of malfeasance and bad actions.

If some normal political situation makes the apparently fraudulent aggregation units appear fraudulent to the eforensics model and estimation procedure, then the frauds estimates may be signaling that “frauds” occur where in fact something else is happening. In particular there may be something benign that leads many of the pre-vote units to have a turnout and vote choice distribution that differs so much, especially from the distribution for election-day postal units, the latter comprising the bulk of the data.

Likewise something benign may distinguish the election-day postal units that the eforensics model identifies as fraudulent.

Beyond that general caution, there may be something about the particular data used for the analysis that triggers the “fraud” findings: for instance, the data appear to be missing about 100,000 votes and one entire constituency, and the vote totals in the data for constituency-leading candidates do not always match totals reported in “lists of winners.”

And there may be something about the model specification that should be improved that would produce different results.

Statistical findings such as are reported here should be followed up with additional information and further investigation into what happened. The statistical findings alone cannot stand as definitive evidence about what happened in the election.

--------------------------------------------------
References
Ferrari, Diogo, Kevin McAlister and Walter R. Mebane, Jr. 2018. “Developments in Positive Empirical Models of Election Frauds: Dimensions and Decisions.” Presented at the 2018 Summer Meeting of the Political Methodology Society, Provo, UT, July 16–18.

(End of document)

 

Software Available for Downloading, with Documentation

Election Forensics R Package (eforensics tarball) and (eforensics GitHub). Diogo Ferrari, Kevin McAlister, Walter Mebane and Patrick Wu, 2019.

Robust Estimation Software (multinomRob). Walter R. Mebane, Jr., and Jasjeet S. Sekhon, 2003.

Genetic Optimization Using Derivatives for R (RGENOUD). Walter R. Mebane, Jr., and Jasjeet S. Sekhon, 2001. (The ancestral GENOUD C program from 1997 is here.)

Genetic Optimization and Bootstrapping of Linear Structures (GENBLIS). Walter R. Mebane, Jr., and Jasjeet S. Sekhon, 1998.

Papers Available for Downloading

Walter R. Mebane, Jr. 2020. “Frauds in the Korea 2020 Parliamentary Election.”

Walter R. Mebane, Jr. 2019. “Evidence Against Fraudulent Votes Being Decisive in the Bolivia 2019 Election.”

Walter R. Mebane, Jr. 2019. “eforensics: A Bayesian Implementation of A Positive Empirical Model of Election Frauds.”

Patrick Y. Wu, Walter R. Mebane, Jr., Logan Woods, Joseph Klaver, and Preston Due. 2019. “Partisan Associations of Twitter Users Based on Their Self-descriptions and Word Embeddings.” Prepared for presentation at the 2018 Annual Meeting of the American Political Science Association, Washington, DC, Aug 29--Sep 1. And many others.

 


  • Publication: 파이낸스투데이 (Finance Today)