
Thailand’s First Data-Driven Election Forecast Model

  • thaidatapointscom
  • 3 days ago
  • 11 min read

A Technical Companion and Scenario Analysis

Joel Sawat Selway


A shorter version can be found at LatitudeTen.


Image generated by ChatGPT


From Narrative Forecasts to Explicit Political Assumptions

Most Thai election “forecasts” are not models in the statistical sense. They are narratives: extrapolations from recent polls, reputational judgments about parties, and informal assessments of candidate strength. Different analysts often disagree not because they see different data, but because they implicitly assume different political logics about how elections work.


This project formalizes those assumptions.


Rather than asking “Who will win?”, the model asks a more precise question: If Thailand behaves according to a given political logic, what seat outcomes follow?

The answer is not a single number, but a range of outcomes conditional on assumptions.

 

The Core Modeling Strategy

Training Data: 2023 as the Learning Environment

All models are “trained” on the 2023 general election, using constituency-level outcomes. Each observation is a candidate–constituency pair, and the dependent variable is binary: won or lost. The training phase estimates the probability that a candidate wins as a function of three sets of characteristics: personal characteristics, such as election experience, incumbency status, and whether the candidate switched parties between 2019 and 2023; party characteristics, including which party the candidate runs under and how that party performed at the national, regional, and provincial levels; and constituency characteristics, such as the effective number of parties that competed in 2023 and the winner’s margin of victory.
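The two constituency characteristics named above can be made concrete with a short sketch. The article does not state which formula it uses for the effective number of parties; the code below assumes the standard Laakso-Taagepera index (one divided by the sum of squared vote shares) and assumes the margin is the gap between the top two vote shares. The function names and vote counts are hypothetical.

```python
def effective_number_of_parties(votes):
    """Laakso-Taagepera index: 1 / sum of squared vote shares (assumed formula)."""
    total = sum(votes)
    shares = [v / total for v in votes]
    return 1.0 / sum(s * s for s in shares)


def winner_margin(votes):
    """Gap between the top two vote shares (assumed definition of the margin)."""
    total = sum(votes)
    ranked = sorted(votes, reverse=True)
    return (ranked[0] - ranked[1]) / total


# Hypothetical constituency: raw vote counts for four candidates.
example_votes = [40_000, 30_000, 20_000, 10_000]
enp = effective_number_of_parties(example_votes)
margin = winner_margin(example_votes)
```

A seat split evenly between two candidates gives an index of exactly 2; the more candidates share the vote, the higher the index climbs, which is why it serves as a fragmentation measure.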

 

The Inputs

Before turning to results, it is useful to clarify how the model’s variables are structured conceptually. Rather than throwing predictors into a regression mechanically, the variables are grouped to correspond to distinct theories of how Thai elections operate.

The full list of variables is given in Table 1.

| Variable | Variable Family | Interpretation |
| --- | --- | --- |
| Sitting MP Status | Candidate-Level | Incumbency advantage |
| Ran in previous election | Candidate-Level | Prior ballot exposure. Regardless of whether a candidate won or lost. |
| First Time Candidate | Candidate-Level | Electoral inexperience |
| Party Switching | Candidate-Level | Changed party since last election |
| Stable Partisan | Candidate-Level | Stayed with same party between 2019-2026. Long-term party loyalty |
| Constituency Structure | Constituency Structure | Fragmentation / competitiveness |
| Winner Share | Constituency Structure | Past dominance in a seat |
| Party Strength (National) | Party Strength | Party-list vote strength. Indicated by Jan 30 poll in 2026 model. |
| National Vote Change | Party Strength | What is a party polling nationally compared to how they did in the previous election? |
| Party Strength (Regional) | Party Strength | Regional vote density. How well the party did across the entire region (six: North, Northeast, East, Central, Bangkok, and South) |
| Party Strength (Provincial) | Party Strength | Province-level vote density |
| Democrat South | Territorial Interactions | Democrat southern stronghold |
| Pheu Thai North, Northeast | Territorial Interactions | Pheu Thai regional bases in North and Northeast |
| Kla Tham Northern cluster, South | Territorial Interactions | Kla Tham northern pocket in Phayao and parts of Chiang Rai and Nan; and in the South |
| BJT Northeast concentration | Territorial Interactions | BJT northeast dominance (especially in Southern Isan) |

Table 1. Model Variables by Conceptual Category

 

Model Form

All specifications use logistic regression with clustered standard errors at the constituency level.
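In notation, a specification of this kind can be written as a standard logit. The symbols are illustrative shorthand rather than notation taken from the model itself: $i$ indexes candidates and $c$ constituencies, $\mathbf{x}_i$ collects the candidate-level variables, $\mathbf{p}_i$ the party-strength variables and territorial interactions for the candidate's party, and $\mathbf{z}_c$ the constituency-structure variables.

```latex
\Pr(\text{win}_{ic} = 1)
  = \Lambda\!\left(\alpha + \mathbf{x}_i^{\top}\boldsymbol{\beta}
      + \mathbf{p}_i^{\top}\boldsymbol{\gamma}
      + \mathbf{z}_c^{\top}\boldsymbol{\delta}\right),
\qquad
\Lambda(u) = \frac{1}{1 + e^{-u}}
```

Clustering at the constituency level changes the estimated standard errors, not the coefficients, so it has no effect on the predicted probabilities themselves.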


Once coefficients are estimated, they are stored and then applied, unchanged, to the 2026 candidate field. Consider a candidate running in 2026. We have data on whether that candidate switched parties, which party they are running for, how that party is currently polling nationally and regionally, and which constituency they are running in. The model simply applies the patterns from 2023 to those same variables. If switching parties reduced a candidate’s chances of winning a seat in 2023, then a 2026 candidate who switched parties between 2023 and 2026 receives that same penalty. If polling well regionally increased a candidate’s chance of winning in 2023, then the model takes how the party is polling regionally in 2026 and applies the same advantage to all candidates from that party. In other words, the model “projects” 2023 patterns onto 2026 candidates for every variable in the model.


Put differently, what changes in 2026 are the inputs, not the rules:

  • party polling replaces past vote shares,

  • candidate status updates (new, switched, incumbent),

  • constituency structure remains fixed from 2023.
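The idea of fixed rules and changing inputs can be sketched in a few lines. This is a minimal illustration, not the model's actual code: the coefficient values, variable names, and the example candidate are all hypothetical, and only a handful of the variables from Table 1 are included.

```python
import math

# Hypothetical coefficients standing in for the stored 2023 estimates.
COEFS_2023 = {
    "intercept": -2.0,
    "incumbent": 1.2,
    "switched_party": -0.6,
    "national_strength": 3.0,   # party's national share, on a 0-1 scale
    "regional_strength": 2.0,   # party's regional share, on a 0-1 scale
}


def win_probability(features, coefs=COEFS_2023):
    """Logistic prediction: fixed coefficients, candidate-specific inputs."""
    score = coefs["intercept"]
    for name, value in features.items():
        score += coefs[name] * value
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical 2026 candidate: a sitting MP who switched parties.
# Polling figures take the place of the 2023 vote shares as inputs.
candidate_2026 = {
    "incumbent": 1,
    "switched_party": 1,
    "national_strength": 0.30,
    "regional_strength": 0.25,
}
p_switch = win_probability(candidate_2026)
p_stay = win_probability({**candidate_2026, "switched_party": 0})
```

With these made-up numbers the switching penalty lowers the candidate's probability relative to an otherwise identical candidate who stayed put, which is the mechanism the paragraph above describes.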

 

Examples

To make this intuitive, consider several stylized candidate types:

  • Candidate A: Won in 2023, did not switch

→ Receives incumbency advantage + stable partisan bonus + constituency familiarity

  • Candidate B: Ran and lost in 2023

→ Gains prior-candidacy recognition but no incumbency boost

  • Candidate C: First-time candidate

→ Receives an experience penalty regardless of party

  • Candidate D: Won in 2023 but switched parties

→ Incumbency advantage partially offset by switching penalty

  • Candidate E: Lost in 2023 and switched

→ Suffers both lack of incumbency and switching penalty

  • Candidate F: Did not run in 2023 but ran earlier with same party

→ Gains historical recognition without switching cost

  • Candidate G: Ran previously and switched

→ Treated as a realignment candidate: rewarded only in candidate-centered models
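One way to see how these stylized types map onto the candidate-level variables in Table 1 is to derive the dummy variables from a simple history record. The coding rules below are an assumption made for illustration (the article does not spell out its exact coding), and the field names are hypothetical.

```python
def candidate_flags(won_last, ran_last, ran_earlier, party_now, party_before):
    """Derive candidate-level dummies from a simple history record.

    party_before is the party of the most recent earlier candidacy, or
    None for someone who has never run. The coding rules are assumed.
    """
    has_history = ran_last or ran_earlier
    switched = has_history and party_before != party_now
    return {
        "incumbent": int(won_last),
        "ran_previous": int(ran_last),
        "first_time": int(not has_history),
        "switched_party": int(switched),
        "stable_partisan": int(has_history and not switched),
    }


# Candidate A: won in 2023, did not switch.
a = candidate_flags(True, True, False, "X", "X")
# Candidate C: first-time candidate.
c = candidate_flags(False, False, False, "X", None)
# Candidate D: won in 2023 but switched parties.
d = candidate_flags(True, True, False, "Y", "X")
# Candidate F: did not run in 2023 but ran earlier with the same party.
f = candidate_flags(False, False, True, "X", "X")
```

Under this coding, Candidate D carries both the incumbency flag and the switching flag, so the two effects offset each other in the predicted probability.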

 

How Predictions Are Generated

After each of the ~3,500 candidates is assigned a probability of winning in their constituency, the model simply awards each seat to the candidate with the highest probability in that constituency. Seats are then tallied at the national level for each party. Each of the 23 models that I ran produced its own seat predictions. I then grouped the models by family to produce a range of seat outcomes for each party.
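The allocation and grouping steps can be sketched as follows. This is an illustrative outline under stated assumptions, not the author's code: the function names, the toy probabilities, and the tie-breaking rule (the first candidate listed keeps the seat) are all hypothetical.

```python
from collections import Counter, defaultdict


def allocate_seats(predictions):
    """predictions: iterable of (constituency, party, win_probability).

    Awards each constituency to the highest-probability candidate and
    returns seats per party. Ties go to the first candidate listed.
    """
    best = {}
    for constituency, party, prob in predictions:
        if constituency not in best or prob > best[constituency][1]:
            best[constituency] = (party, prob)
    return Counter(party for party, _ in best.values())


def family_ranges(seats_by_model, family_of):
    """Min-max seats per party across the models in each family."""
    parties = {p for seats in seats_by_model.values() for p in seats}
    ranges = defaultdict(dict)
    for model, seats in seats_by_model.items():
        fam = family_of[model]
        for party in parties:
            n = seats.get(party, 0)  # a party absent from a model won 0 seats
            lo, hi = ranges[fam].get(party, (n, n))
            ranges[fam][party] = (min(lo, n), max(hi, n))
    return dict(ranges)


# Toy example: two models in the same family, three constituencies.
model_a = [("c1", "PP", 0.6), ("c1", "BJT", 0.3), ("c2", "PP", 0.5),
           ("c2", "PT", 0.4), ("c3", "BJT", 0.7), ("c3", "PP", 0.2)]
model_b = [("c1", "PP", 0.4), ("c1", "BJT", 0.5), ("c2", "PP", 0.3),
           ("c2", "PT", 0.6), ("c3", "BJT", 0.7), ("c3", "PP", 0.2)]
seats_by_model = {"m1": allocate_seats(model_a), "m2": allocate_seats(model_b)}
ranges = family_ranges(seats_by_model, {"m1": "hybrid", "m2": "hybrid"})
```

Because each seat goes to the single most likely winner, small shifts in probability between models can flip a constituency outright, which is one reason the within-family ranges can be wide.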

 

The Four Model Families

Group 1: Nationalized Models

Core logic: Voters respond primarily to national party brands rather than local or candidate-specific factors. National polling performance carries heavy weight in determining constituency outcomes, while local machines, the personal vote, and constituency-level variation play only a limited role. Under this logic, elections behave like a single national contest replicated across hundreds of seats.

These models closely resemble forecasts that extrapolate directly from national polling.

Who benefits: PP

Who loses: DP, KT

Pattern: Uniformity overwhelms structure

 

Group 2: Regional–Provincial Models

Core logic: Here, territorial politics remain decisive. Regional and provincial vote strength play a central role, allowing parties with strong geographic bases to outperform their national polling. Under this logic, Bhumjaithai and the Democrats surge by converting localized strength into seats, while the People’s Party loses its uniform dominance as support varies sharply across regions.

This family best reflects Thailand’s long-standing pattern of regionalized voting.

Who benefits: BJT, DP, PT

Who loses: PP (compression)

Pattern: Geography beats brand

 

Group 3: Candidate-Centered Models

Core logic: In this scenario, who runs matters more than the party label they carry. Incumbency, party switching, and prior candidacy experience are heavily emphasized, rewarding well-known local figures and established machines. Kla Tham becomes electorally viable in this environment, and smaller or “other” parties expand sharply as candidate reputation overrides national branding.

These models resemble elections dominated by local brokers and personal machines.

Who benefits: KT, “Other” parties

Who loses: PP

Pattern: Machines beat movements

 

Group 4: Hybrid Models

Core logic: This model assumes a mixed political environment in which no single force dominates. National mood, regional structure, and candidate characteristics all matter simultaneously. The result is the widest dispersion of outcomes across parties and the highest likelihood of a fragmented parliament, where multiple political logics coexist rather than one overwhelming the others.

These are arguably the most realistic—and the most politically unstable—scenarios.

Pattern: Constraint everywhere, dominance nowhere

Who benefits: BJT, PT, DP, “Other” parties

Who loses: PP (no longer consistently dominant), KT (cannot fully break through without candidate-centered dominance)

 

The Results

Below are the results across four families of models, each corresponding to a different political logic.

| Party Name | Group 1: Nationalized Models | Group 2: Regional–Provincial Models | Group 3: Candidate-Centered Models | Group 4: Hybrid Models |
| --- | --- | --- | --- | --- |
| People’s Party (PP) | 200 – 250 | 130 – 200 | 85 – 170 | 50 – 225 |
| Pheu Thai (PT) | 30 – 60 | 80 – 110 | 60 – 105 | 45 – 100 |
| Bhumjaithai (BJT) | 15 – 45 | 70 – 135 | 25 – 120 | 30 – 130 |
| Kla Tham (KT) | 0 – 5 | 5 – 30 | 25 – 45 | 10 – 40 |
| Democrat (DP) | 5 – 10 | 25 – 55 | 10 – 45 | 6 – 50 |
| Other parties (≥10 seats combined) | 10 – 20 | 25 – 50 | 30 – 60 | 20 – 55 |

Table 1. Seat ranges (min–max within each model group)


Each column in Table 1 represents a distinct political logic rather than a single forecast. Nationalized models reward parties with broad, uniform support (benefiting People’s Party most), while regional–provincial models favor territorially embedded parties such as Bhumjaithai and the Democrats. Candidate-centered models sharply increase the seat potential of Kla Tham, reflecting scenarios where party switching and local machines dominate. Hybrid models illustrate mixed environments, producing the widest dispersion and the highest likelihood of a fragmented parliament.

 

Party Ranges


1. People’s Party (PP): High Ceiling, Fragile Floor (≈50–250 seats)

The People’s Party dominates only under models that assume a strong, uniform national swing translating cleanly into constituency victories. Once regional variation and candidate-level structure are allowed to matter—such as entrenched local machines, incumbency effects, and uneven party penetration—PP’s projected seat count compresses sharply. In these scenarios, national popularity no longer guarantees constituency success, revealing a party with an exceptionally high ceiling but a much more fragile floor than headline polling suggests.

Technically: PP’s coefficient on national polling share is large, but it does not interact well with ENP or provincial density. What this means is that PP’s variance is driven almost entirely by assumption sensitivity. Its vote is highly elastic to nationalization. Once constituency structure reasserts itself, PP’s conversion rate from votes to seats collapses faster than any other major party.

Takeaway: PP’s success depends on sustaining a nationalized election. Any return to localism cuts deeply into its advantage.

 

2. Pheu Thai (PT): Deep Roots, Shrinking Dominance (≈30–110 seats)

Pheu Thai never truly collapses in any of the models—but it also rarely dominates outright. Its electoral strength is rooted less in national momentum than in durability: deep regional strongholds in the North and Northeast, dense and well-established candidate networks, and a long record of historical constituency performance. As a result, PT’s seat totals tend to compress into a relatively narrow band across models, rising when regional structure matters and falling when nationalized waves overpower local loyalties.

Technically: PT’s resilience comes from low variance across models. Even when national strength weakens, PT’s regional base keeps it afloat. This is a party whose seat production is driven by geographic depth, not national momentum.

Takeaway: PT is the anchor party of Thai electoral politics. It benefits most when elections are not purely nationalized.

 

3. Bhumjaithai (BJT): The Ultimate Swing Party (≈15–135 seats)

Across nearly all model families, Bhumjaithai maintains a notably high lower bound, making it one of the most structurally resilient parties in the forecast. It performs best in scenarios where provincial strength matters, candidate history is weighted heavily, and national swings are muted. In other words, BJT thrives when elections resemble a patchwork of local contests rather than a single national wave—allowing its entrenched provincial networks to convert fragmentation into seats.

Technically: BJT’s defining feature is floor strength. Its lower bound remains high because it is competitive across multiple logics: provincial, candidate-centered, and hybrid. It is the only party that never depends on a single assumption.

Takeaway: BJT is structurally insulated from volatility. It is the party least dependent on a single electoral logic.

 

4. Democrat Party (DP): Regional Survival, National Uncertainty (≈6–60 seats)

The Democrats survive only in models that allow regional identity—especially the South—to matter. When elections are fully nationalized, DP collapses almost entirely, crowded out by larger national brands. When regional structure is preserved, however, the party rebounds sharply, reclaiming a meaningful bloc of seats anchored in long-standing southern loyalties. DP’s fate is therefore a direct barometer of whether local identity still constrains national polarization.

Technically: DP is the clearest example of structural dependence. Its survival hinges entirely on whether southern identity remains electorally meaningful. In models where regional effects are muted, DP disappears.

Takeaway: DP’s fate depends entirely on whether the South remains politically distinct.

 

5. Kla Tham (KT): Breakthrough Contingent on Alignment (≈1–45 seats)

Kla Tham ranges from near zero to more than 40 seats, entirely depending on the political assumptions driving the model. It succeeds only when candidate history is emphasized, party switching is rewarded rather than punished, and local machines dominate over national trends. In these scenarios, KT functions less like a conventional party and more like a vehicle for entrenched local elites. When elections instead behave as nationalized contests, KT’s seat share collapses, underscoring how contingent its fortunes are on constituency-level politics.

Technically: Kla Tham is the most contingent party in the system. It succeeds only when candidate history is emphasized, party switching is rewarded, and constituency fragmentation is high.

Takeaway: KT’s future is not about national popularity—it is about who runs, where, and with what local backing.

 

Best Predictions & What Must Be True for Each Outcome

Table 2 averages the seat prediction ranges for each party. The average range should not be read as a point forecast. Instead, it is a deliberately model-agnostic expectation that answers a more fundamental question: if we do not know which political logic will dominate the 2026 election, where do parties tend to land on average across competing assumptions?


By construction, this measure dampens extreme national-wave scenarios, penalizes parties whose success depends on only one modeling logic, and highlights parties with structurally resilient electoral machines such as Bhumjaithai and the Democrats. At the same time, it makes clear why the People’s Party exhibits the widest uncertainty band in Thai politics: its seat outcomes hinge almost entirely on whether the election behaves like a nationalized wave contest or fractures into hundreds of constituency-level battles with strong regional and candidate effects.

| Party | Full Constituency Range (All Models) | Average Range Across Model Families | Party-List Seats | Best Prediction (Total Seats) |
| --- | --- | --- | --- | --- |
| People’s Party (PP) | 50 – 250 | 116 – 211 | 33 | 149 – 244 |
| Bhumjaithai (BJT) | 14 – 135 | 35 – 108 | 24 | 59 – 132 |
| Pheu Thai (PT) | 28 – 110 | 54 – 94 | 17 | 71 – 111 |
| Democrat Party (DP) | 6 – 58 | 12 – 40 | 13 | 25 – 53 |
| Kla Tham (KT) | 0 – 44 | 10 – 30 | 0 | 10 – 30 |
| Other parties (≥10 seats in any model) | 10 – 60 | 21 – 46 | 3 | 24 – 49 |
| TOTAL | 400 | 400 | 100 | 500 |

Table 2. Average ranges across four model types by party
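The averaging behind Table 2 can be checked with simple arithmetic. The sketch below assumes that each average is the plain mean of the four family minimums (or maximums) from the seat-range table, rounded to the nearest seat with halves rounded up, and that the best prediction adds the party-list seats to both ends. Under those assumptions it reproduces the average and total columns shown above; the variable and function names are illustrative.

```python
import math

# Constituency seat ranges (min, max) by model family, copied from the
# seat-range table, in the order: nationalized, regional-provincial,
# candidate-centered, hybrid.
FAMILY_RANGES = {
    "PP":    [(200, 250), (130, 200), (85, 170), (50, 225)],
    "PT":    [(30, 60), (80, 110), (60, 105), (45, 100)],
    "BJT":   [(15, 45), (70, 135), (25, 120), (30, 130)],
    "KT":    [(0, 5), (5, 30), (25, 45), (10, 40)],
    "DP":    [(5, 10), (25, 55), (10, 45), (6, 50)],
    "Other": [(10, 20), (25, 50), (30, 60), (20, 55)],
}
# Party-list seats as reported in Table 2.
PARTY_LIST = {"PP": 33, "BJT": 24, "PT": 17, "DP": 13, "KT": 0, "Other": 3}


def round_half_up(x):
    """Round to the nearest integer, with .5 rounded up (assumed rule)."""
    return math.floor(x + 0.5)


def average_range(ranges):
    """Mean of the family minimums and mean of the family maximums."""
    lo = round_half_up(sum(r[0] for r in ranges) / len(ranges))
    hi = round_half_up(sum(r[1] for r in ranges) / len(ranges))
    return lo, hi


averages = {party: average_range(r) for party, r in FAMILY_RANGES.items()}
totals = {party: (lo + PARTY_LIST[party], hi + PARTY_LIST[party])
          for party, (lo, hi) in averages.items()}
```

For the People’s Party, for example, the four minimums average to 116.25 and the four maximums to 211.25, giving the 116 – 211 range; adding 33 party-list seats yields 149 – 244.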


This framework naturally leads to a clearer way of thinking about scenarios, rather than a single predicted outcome. Each party’s seat range maps onto a distinct political logic that could plausibly dominate the 2026 election. When one logic overwhelms the others, the system tilts sharply in a particular direction; when no logic wins out, fragmentation persists. The following scenarios summarize the main equilibrium paths implied by the models:


  • PP landslide: election is nationalized; candidate effects weaken; local machines erode

  • BJT plurality: provincial strength dominates; fragmentation across a large number of constituencies remains high

  • KT surge: party switching pays off; local incumbents matter more than brands

  • PT resurgence: regional strongholds hold; national wave softens

  • Fragmented parliament: no single logic dominates; “Other” parties gain 40–60 seats

 

Back to the Experts

Let’s return to the Thai Rath article that motivated this series of articles. Their range for PP was 100-130, which downweights the first family of models in this analysis, the nationalized models. That prediction fits best within the candidate-centered and hybrid models.


Their range of 140-150 for BJT is above the high end of the model averages in Table 2. Only the regional-provincial model in this forecast came close to that (a high of 135).


Expert predictions for PT, 80-120, were very similar to the range the average of these models produced and almost perfectly fit the regional-provincial family of models.


Similarly for the DP, experts came very close to the average of these models and best fit the regional-provincial family of models.


Lastly, the predictions for KT were 40-70. In no model could I produce a KT outcome of more than 45.


In short, the experts’ individual predictions appear to lean most heavily on the logic of the regional-provincial model. Taken together, however, their reasoning sits closer to the hybrid model, and their predictions for the top five parties are also consistent with it.
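One rough way to make this comparison explicit is to measure how much each expert range overlaps with each model family's constituency range. The overlap width used below is a comparison device introduced here for illustration, not a method from the article; the expert ranges are the ones quoted above (no figure is quoted for DP, so it is left out), and the family ranges come from the seat-range table.

```python
FAMILIES = ["nationalized", "regional-provincial", "candidate-centered", "hybrid"]

# Model constituency ranges (min, max) per family, in the order above.
MODEL_RANGES = {
    "PP":  [(200, 250), (130, 200), (85, 170), (50, 225)],
    "BJT": [(15, 45), (70, 135), (25, 120), (30, 130)],
    "PT":  [(30, 60), (80, 110), (60, 105), (45, 100)],
    "KT":  [(0, 5), (5, 30), (25, 45), (10, 40)],
}
# Expert ranges as quoted in the text.
EXPERT_RANGES = {"PP": (100, 130), "BJT": (140, 150), "PT": (80, 120), "KT": (40, 70)}


def overlap_width(a, b):
    """Width of the shared interval; negative when the ranges are disjoint."""
    return min(a[1], b[1]) - max(a[0], b[0])


overlap = {
    party: {fam: overlap_width(EXPERT_RANGES[party], rng)
            for fam, rng in zip(FAMILIES, MODEL_RANGES[party])}
    for party in EXPERT_RANGES
}
```

On this measure the PP range overlaps most with the candidate-centered and hybrid families, the PT range with the regional-provincial family, and the BJT range lies above every family's maximum, in line with the comparisons drawn above.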

 

Why This Matters

Thailand is entering an era where electoral outcomes can no longer be explained by a single story or a single number. This model does not claim certainty. Instead, it clarifies which political assumptions produce which futures. By showing how national waves, regional machines, candidate networks, and party switching reshape the seat map in different ways, it illuminates what is truly at stake in the 2026 election.


That, ultimately, is what a forecast should do.

 

 
 
 