Machine Learning Algorithms Dive Into GE14 Election Battle On Twitter-Verse

Article May 19, 2021

OVER the last couple of years, the power of data analytics has been cast in a negative light as a society-shaping, highly impactful kingmaker tool in a nation’s elections. Think Cambridge Analytica and the slew of claims of election rigging and data privacy violations. Data analytics and machine learning algorithms have also been used to predict election results. Are you aware of the use of Twitter sentiment analysis?


Taking pride in being a Malaysia-grown regional enterprise, the Center of Applied Data Science (CADS) seized the opportunity of the 14th Malaysian General Election (GE14) to showcase how data analytics can draw on ethically sourced data and still produce highly impactful, game-changing insights that can ultimately be used to decide the winning coalition in GE14.

Since the Cambridge Analytica scandal came to light, the #DeleteFacebook movement has made slow but steady progress in Western countries. Although not severely affected, Malaysian Facebook users grew wary of using Facebook to discuss GE14-related matters. This left Twitter as the dominant source of data and input for conducting any GE14 sentiment analysis.

Between March 1 and April 30, 2018, we collected 118,491 tweets containing “General Election Malaysia” and “GE14” content, which became the basis of our “Malaysia’s GE14 Sentiment Analysis” study.

We then employed machine learning algorithms alongside data collection and cleansing, sentiment analysis, online behaviour analysis, topic modelling and influencer network detection to generate actionable insights.
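To make the cleansing and sentiment-scoring steps concrete, here is a minimal sketch in Python. It is not the model used in the study, which has not been published. It assumes a simple lexicon-based approach, and the word lists and the function names `clean_tweet` and `polarity` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical toy lexicons for illustration only; the study's actual
# model and vocabulary are not described in this article.
POSITIVE = {"love", "hope", "strong", "progress"}
NEGATIVE = {"scandal", "burden", "corrupt", "angry"}


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, @mentions and punctuation.

    Hashtag words are kept (only the '#' symbol is removed).
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation, including '#'
    return " ".join(text.split())              # collapse repeated whitespace


def polarity(text: str) -> str:
    """Label a tweet by counting lexicon hits after cleaning."""
    tokens = clean_tweet(text).split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Made-up example tweets, not taken from the collected data set.
    for tweet in [
        "I LOVE my PM! https://t.co/abc @someone #GE14",
        "Another scandal, another burden on the rakyat",
    ]:
        print(polarity(tweet), "|", clean_tweet(tweet))
```

A production pipeline would more likely train a classifier on labelled tweets and handle Malay as well as English text; the lexicon approach is shown here only because it fits in a few lines.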

Our study set out to understand whether we could pick up the salient sentiments of the Malaysian population, using the Twitter-verse as a proxy environment, and identify the individuals or factors that could influence sentiment polarity in the GE14 environment.

A heat map with the key words, both positive and negative, associated with Malaysia’s GE14.

From the get-go, we were able to identify one coalition clearly leading the way in leveraging social media to shape sentiment polarity compared with the other coalition groups.

While we would like to provide a detailed report on our study, we will hold off publishing the full report for now to ensure it does not critically sway or decide the upcoming elections. However, we will provide a preliminary view of some key sentiments raised by our study and are sharing the heat map of key words, both positive and negative, associated with Barisan Nasional, Pakatan Harapan and Parti Islam Se-Malaysia (see infographic above).

Backed by a 76% accuracy level, corroborated by our internal validation team, we were able to establish that the “stickiness” of the election engine for the Barisan Nasional coalition, the Pakatan Harapan coalition and the Parti Islam Se-Malaysia (PAS) coalition each depended on different trigger points when it came to garnering support or even establishing defences.
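For readers unfamiliar with how an accuracy figure of this kind is usually derived, the sketch below shows one common approach: comparing model labels against a hand-labelled sample. This is an assumption about the general method, not a description of the validation team's actual procedure, and the labels shown are made up for illustration.

```python
def accuracy(predicted: list[str], actual: list[str]) -> float:
    """Share of items where the model's label matches the human label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty sample")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


if __name__ == "__main__":
    # Hypothetical labels, not taken from the study.
    model_labels = ["positive", "negative", "neutral", "positive"]
    human_labels = ["positive", "negative", "positive", "positive"]
    print(f"accuracy = {accuracy(model_labels, human_labels):.0%}")
```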

For example, the Barisan Nasional coalition depended on standard influencer-generated “I Love My PM” and manifesto-related postings that were not able to reach the wider population of Malaysia.

The Pakatan Harapan coalition, however, picked up on the “burden of the Rakyat” sentiment, which easily elicited content from many influencers who believed that they were creating content from the Rakyat’s perspective.

Faced also with a constant stream of scandal-tinged postings from the negative side of the Twitter-verse, Barisan Nasional had an uphill battle in turning around the sentiment despite having a strong track record in governmental performance.

This is data-supported evidence that “Current Sentiment Crushes Past Performance” is indeed a real factor to be considered by any election engine.

Another interesting insight has been the unintended bias that certain foreign media have created in GE14. With increasing scepticism towards Malaysian local media, the Pakatan Harapan coalition, which commands a dominant position in the use of digital social media as an election engine, has come to rely on foreign publications and media to build the image required to support its election campaign.

While the Barisan Nasional coalition has tried to sway sentiments by providing a more accurate representation of Malaysia’s economic performance over the years, the perception doggedly remains that this is simply another focus-shifting effort from the ruling coalition.

Building on this insight, a more emotion-aware election campaign and strategy would have elicited more support for Barisan Nasional than a focus on track record and performance benchmarking.

A final insight from our study, as we gear up for the final stretch of Malaysia’s GE14, is how short-lived campaign impact and “campaign-memory” have become in the Twitter-verse.

From a nonchalant attitude towards GE14 prior to the dissolution of Parliament, the Malaysian public has evidently become increasingly sensitive to “sentiment triggers”, irrespective of which coalition group they come from.

Before the dissolution, it took about a week for sentiments to switch. In the first week of May, it would take only two concentrated digital campaigns or one major event for a coalition party to slowly begin a general sentiment overturn.

Keeping this in mind, the election engines of both coalitions and other standalone parties will need to play a calculated game of anticipation to see who can pull off the right content and impact that will follow voters into polling booths on May 9, 2018.

While our research has yielded several more deeply intriguing insights into social patterns, we wish to preserve our impartiality in GE14 by not publishing the full report during the campaigning period.

However, our doors will always be open, after the conclusion of Malaysia’s GE14, to help anyone and everyone uncover the power of Data Science & Analytics in shaping the world of tomorrow! Do contact us.

This is just one example of how a Data Science & Analytics application can literally disrupt, shape and decide the world of tomorrow today!

Sharala Axyrd is the CEO of CADS (The Center of Applied Data Science), ASEAN’s first and only one-stop platform and center of excellence for Data Science.

*This article was first published on Digital News Asia.


