Introduction

Wikipedia, launched in 2001, has become a significant repository of knowledge, encompassing a wide array of historical events and narratives. This project analyzes and measures narrative changes in Wikipedia articles related to civil rights movements in Thailand from the 1970s to the present using natural language processing (NLP) techniques, particularly word cloud visualization and sentiment analysis, with the aim of uncovering significant shifts in how these historical events are portrayed and perceived over time.

However, it is essential to acknowledge an inherent limitation of using Wikipedia for historical narrative analysis: Wikipedia articles are secondary documentation, synthesized from various sources, rather than primary documents created contemporaneously with the events they describe. This project therefore examines the narratives constructed within these articles, which reflect broader societal perceptions and historical interpretations.

Word cloud visualization is used to surface the themes of each key event, highlighting prominent terms and their evolution across decades. Sentiment analysis complements this by examining the emotional tones embedded within the narratives, shedding light on societal perceptions and the impact of political interventions on civil liberties. Together, these methods show how historical memory and societal discourse evolve and are represented on a digital platform like Wikipedia.

Data Collection

Overview of Collected Articles

The following is a list of Wikipedia articles related to key civil rights events in Thailand from the 1970s to the present, organized by decade:

1970s: 1973 Thai popular uprising; 6 October 1976 massacre

1990s: Black May (1992)

2000s: 2006 Thai coup d'état

2010s: 2010 Thai political protests; 2014 Thai coup d'état

2020s: 2020–2021 Thai protests

Data Scraping

Scrape data from the selected Wikipedia articles:

# Install and load necessary packages
install.packages(c("rvest",
                   "xml2", 
                   "dplyr", 
                   "tokenizers", 
                   "tm", 
                   "SnowballC", 
                   "topicmodels", 
                   "ggplot2", 
                   "tidyverse", 
                   "wordcloud"))

library(rvest)
library(xml2)
library(dplyr)
library(tokenizers)
library(tm)
library(SnowballC)
library(topicmodels)
library(ggplot2)
library(tidyverse)
library(wordcloud)

# Define URLs
urls <- c(
  "<https://en.wikipedia.org/wiki/1973_Thai_popular_uprising>",
  "<https://en.wikipedia.org/wiki/6_October_1976_massacre>",
  "<https://en.wikipedia.org/wiki/Black_May_(1992)>",
  "<https://en.wikipedia.org/wiki/2006_Thai_coup_d%27%C3%A9tat>",
  "<https://en.wikipedia.org/wiki/2010_Thai_political_protests>",
  "<https://en.wikipedia.org/wiki/2014_Thai_coup_d%27%C3%A9tat>",
  "<https://en.wikipedia.org/wiki/2020%E2%80%932021_Thai_protests>"
)

# Data Scraping
article_content <- list()

for (url in urls) {
  webpage <- read_html(url)
  # The first <h1> holds the article title; <p> nodes hold the body text
  article_title <- html_text(html_node(webpage, "h1"))
  article_paragraphs <- html_text(html_nodes(webpage, "p"))
  article_content[[url]] <- list(title = article_title, content = article_paragraphs)
  Sys.sleep(1)  # pause briefly between requests to be polite to Wikipedia's servers
}

# View scraped data
print(article_content)

Data Cleaning and Preparation

Text Processing: Tokenization, Stopword Removal, Stemming

Data Structuring: Converting unstructured data into a data frame

# Data Cleaning and Preparation
cleaned_texts <- list()

for (url in names(article_content)) {
  paragraphs <- article_content[[url]]$content
  text <- paste(paragraphs, collapse = " ")
  # Tokenize, drop English stopwords, then stem each token
  tokens <- unlist(tokenize_words(text))
  tokens <- tokens[!tokens %in% stopwords("en")]
  tokens <- wordStem(tokens, language = "english")
  cleaned_text <- paste(tokens, collapse = " ")
  cleaned_texts[[url]] <- list(title = article_content[[url]]$title, content = cleaned_text)
}

# View cleaned data
print(cleaned_texts)

# Convert to Data Frame (one row per article)
structured_data <- bind_rows(lapply(names(cleaned_texts), function(url) {
  data.frame(
    url = url,
    title = cleaned_texts[[url]]$title,
    content = cleaned_texts[[url]]$content,
    stringsAsFactors = FALSE
  )
}))

# View structured data
View(structured_data)

Visualization and Analysis

Word Cloud Generation

Word clouds are generated to illustrate narrative shifts: by comparing the word cloud of each article, thematic shifts across different decades can be identified. The per-article clouds are produced first; a sketch of a combined comparison cloud follows the code below.

# Word Cloud Generation


# Create a function to generate and plot word clouds
generate_wordcloud <- function(article_text, article_title) {
  # Count word frequencies by splitting the cleaned text on whitespace
  word_freq <- table(unlist(strsplit(article_text, "\\s+")))
  # wordcloud() has no title argument, so draw the title in a small panel above the cloud
  layout(matrix(1:2, nrow = 2), heights = c(1, 6))
  par(mar = rep(0, 4))
  plot.new()
  text(0.5, 0.5, article_title, cex = 1.3, font = 2)
  wordcloud(words = names(word_freq), freq = as.numeric(word_freq), max.words = 100,
            random.order = FALSE, colors = brewer.pal(8, "Dark2"))
}

# Generate and plot word clouds one by one
for (i in 1:nrow(structured_data)) {
  dev.new() 
  generate_wordcloud(structured_data$content[i], structured_data$title[i])
  readline(prompt = "Press [enter] to see the next word cloud...")
}
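
Because the per-article clouds are judged side by side, it can also help to combine them into a single comparison cloud, in which each term is placed with the article where it is relatively most frequent. The sketch below is an optional extension rather than part of the original pipeline; it reuses the already-loaded tm package to build a term-document matrix from structured_data and passes it to wordcloud's comparison.cloud().

# Optional sketch: combine all articles into one comparison cloud
corpus <- VCorpus(VectorSource(structured_data$content))
tdm <- TermDocumentMatrix(corpus)               # rows = terms, columns = articles
term_matrix <- as.matrix(tdm)
colnames(term_matrix) <- structured_data$title  # label each column with its article title

# Each term is sized by how much more frequent it is in its article than elsewhere
comparison.cloud(term_matrix, max.words = 150, random.order = FALSE,
                 title.size = 0.8, colors = brewer.pal(7, "Dark2"))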
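
Sentiment Analysis

The introduction also calls for sentiment analysis of these narratives, which the pipeline above does not yet implement. The following is a minimal sketch of one way to add it, assuming the syuzhet package (not among the packages installed earlier): it scores each article sentence by sentence and plots the mean score per article. It deliberately works on the raw scraped paragraphs rather than the cleaned text, because sentiment lexicons match full word forms rather than stems.

# Sentiment analysis sketch (assumes the syuzhet package, which is not installed above)
install.packages("syuzhet")
library(syuzhet)

# Score the raw, unstemmed paragraphs: lexicons match full word forms, not stems
sentiment_scores <- sapply(article_content, function(article) {
  sentences <- get_sentences(paste(article$content, collapse = " "))
  mean(get_sentiment(sentences, method = "syuzhet"))
})

sentiment_df <- data.frame(
  title = sapply(article_content, function(article) article$title[1]),
  mean_sentiment = sentiment_scores
)

# Compare the average emotional tone of each article
ggplot(sentiment_df, aes(x = reorder(title, mean_sentiment), y = mean_sentiment)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Mean sentence-level sentiment (syuzhet)")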

Results

[Figure placeholder: word clouds generated for each article]

Interpretation and Insights

Narrative Shifts Across Decades

1970s: Student Uprising and Thammasat University Massacre

1990s: Black May

2000s: Military Coup and Political Instability

2010s: Red Shirt Protests and Subsequent Coup

2020s: Pro-Democracy Protests and Calls for Reform