Tagalog Graphemic Module (LGM v1.0)


Modern Alphabet (28 letters)

Tagalog is written in the modern Filipino alphabet: the 26 Latin letters (A–Z) plus Ñ and the digraph Ng, which represents a single phoneme, for 28 letters in total.


Vowels (5)

Glyph  Latin Chain  Phoneme (IPA)        Notes
A      a            /a/                  open central vowel
E      e            /ɛ/                  mid-front vowel
I      i            /i/                  close-front vowel
O      o            /o/                  mid-back vowel
U      u            /u/                  close-back vowel

Consonants (23: 21 from A–Z plus Ñ and Ng)

Glyph  Latin Chain  Phoneme (IPA)        Notes
B      b            /b/
C      c            /k/ or /s/ in loans  not native
D      d            /d/
F      f            /f/ in loans         not native; often /p/ in native speech
G      g            /g/
H      h            /h/
J      j            /d͡ʒ/ in loans        not native; often /h/ or /dy/ in adaptation
K      k            /k/
L      l            /l/
M      m            /m/
N      n            /n/
Ñ      n + tilde    /ɲ/                  palatal nasal (as in señor)
Ng     n + g        /ŋ/                  velar nasal; single phoneme in Tagalog
P      p            /p/
Q      q            /k/ in loans         not native
R      r            /ɾ/                  alveolar tap/trill
S      s            /s/
T      t            /t/
V      v            /v/ in loans         not native; often /b/ in native speech
W      w            /w/
X      x            /ks/ in loans        not native
Y      y            /j/                  palatal approximant
Z      z            /z/ in loans         not native; often /s/ in native speech
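
The vowel and consonant tables could be held in a small lookup structure. The following is a minimal sketch, not part of the LGM itself: the `Grapheme` class, its field names, and the `phoneme_for` helper are hypothetical, and the fallback values are taken from the Notes column. Only a subset of rows is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grapheme:
    glyph: str                             # upper-case form as listed in the table
    ipa: str                               # primary phoneme
    loan_only: bool = False                # True for rows marked "not native"
    native_fallback: Optional[str] = None  # common native substitute, if the Notes give one


# Subset of the inventory (assumed layout); the remaining rows follow the same pattern.
INVENTORY = {
    g.glyph.lower(): g
    for g in [
        Grapheme("A", "a"),
        Grapheme("K", "k"),
        Grapheme("F", "f", loan_only=True, native_fallback="p"),
        Grapheme("V", "v", loan_only=True, native_fallback="b"),
        Grapheme("Z", "z", loan_only=True, native_fallback="s"),
        Grapheme("\u00d1", "\u0272"),      # Ñ -> palatal nasal
        Grapheme("Ng", "\u014b"),          # Ng -> velar nasal
    ]
}


def phoneme_for(grapheme: str, nativize: bool = False) -> str:
    """Return the IPA phoneme for a grapheme, optionally using the native fallback."""
    entry = INVENTORY[grapheme.lower()]
    if nativize and entry.native_fallback is not None:
        return entry.native_fallback
    return entry.ipa
```

With this layout, `phoneme_for("F")` returns the loan pronunciation and `phoneme_for("F", nativize=True)` returns the substitute listed in the Notes column.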

Multi-letter graphemes

  • Ng → /ŋ/ (velar nasal) — always a single sound, never “n” + “g” in native words.
  • In our LGM, treat Ng as phoneme_unit=true so parsing doesn’t split it.
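
One way to keep Ng whole during parsing is a longest-match split. This is a sketch under assumptions: `PHONEME_UNITS` and `split_graphemes` are illustrative names, not an existing LGM interface, and the code merges every n + g sequence, which matches the native-word rule above but would need an exception list for loanwords.

```python
from typing import List

# Multi-letter graphemes that must stay whole (phoneme_unit=true in the text above).
PHONEME_UNITS = ("ng",)


def split_graphemes(word: str) -> List[str]:
    """Split a word into graphemes, keeping 'ng' together as one unit."""
    units: List[str] = []
    i = 0
    while i < len(word):
        for unit in PHONEME_UNITS:
            candidate = word[i:i + len(unit)]
            if candidate.lower() == unit:
                units.append(candidate)   # keep the original casing
                i += len(unit)
                break
        else:
            # No multi-letter unit starts here; emit a single character.
            units.append(word[i])
            i += 1
    return units
```

For example, `split_graphemes("bangka")` yields `["b", "a", "ng", "k", "a"]`, so the velar nasal is never seen as separate n and g tokens.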

Historical Layer (Baybayin mapping)

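Tagalog was historically written in Baybayin, an abugida in which each consonant character carries an inherent /a/. A kudlit mark above the character changes the vowel to /e/ or /i/, and one below changes it to /o/ or /u/; a virama mark, a later addition to the script, cancels the vowel. Each consonant character, together with its mark if present, maps directly to a Latin consonant + vowel combination.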

Example mapping:

  • ᜃ (ka) → k + a
  • ᜃᜒ (ki) → k + i
  • ᜃᜓ (ku) → k + u
  • ᜃ + virama → k (no vowel)
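
The mapping above can be sketched as a small transliterator. The code points come from the Unicode Tagalog block (U+1700 to U+171F); the function name and table names are hypothetical, only the basic letters are covered, and the vowel signs are rendered as i and u because the script does not distinguish /e/ from /i/ or /o/ from /u/.

```python
# Basic consonant letters; each carries an inherent /a/.
CONSONANTS = {
    "\u1703": "k", "\u1704": "g", "\u1705": "ng", "\u1706": "t",
    "\u1707": "d", "\u1708": "n", "\u1709": "p", "\u170A": "b",
    "\u170B": "m", "\u170C": "y", "\u170E": "l", "\u170F": "w",
    "\u1710": "s", "\u1711": "h",
}
INDEPENDENT_VOWELS = {"\u1700": "a", "\u1701": "i", "\u1702": "u"}
VOWEL_SIGNS = {"\u1712": "i", "\u1713": "u"}  # kudlit above / below
VIRAMA = "\u1714"                              # cancels the inherent vowel


def baybayin_to_latin(text: str) -> str:
    """Transliterate Baybayin text to Latin letters (assumed, simplified rules)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            if nxt in VOWEL_SIGNS:
                out.append(CONSONANTS[ch] + VOWEL_SIGNS[nxt])
                i += 2
            elif nxt == VIRAMA:
                out.append(CONSONANTS[ch])        # bare consonant
                i += 2
            else:
                out.append(CONSONANTS[ch] + "a")  # inherent vowel
                i += 1
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
            i += 1
        else:
            out.append(ch)                        # pass through anything else
            i += 1
    return "".join(out)
```

Characters outside the tables, such as spaces or punctuation, are passed through unchanged.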

System Integration Notes

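  • Modern Tagalog spelling is largely phonemic: most letters correspond to a single sound. Stress and the glottal stop are not marked in ordinary writing, so pronunciation is not always fully recoverable from spelling alone.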
  • Loan letters (C, F, J, Q, V, X, Z) are fully part of the alphabet but occur mostly in borrowed words.
  • Ng and Ñ are distinct phonemic units and must be preserved in any transliteration or speech model.
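
One common way Ñ gets lost is accent stripping: the letter can be encoded as n followed by a combining tilde (the "n + tilde" chain in the consonant table), and a naive filter that drops combining marks turns it into plain n. The sketch below assumes a text-cleaning step built on Python's standard `unicodedata` module; the helper names are hypothetical.

```python
import unicodedata

ENYE = ("\u00f1", "\u00d1")  # precomposed ñ and Ñ


def normalize_tagalog(text: str) -> str:
    """Compose 'n' + combining tilde into the single code point for ñ/Ñ."""
    return unicodedata.normalize("NFC", text)


def strip_accents_keep_enye(text: str) -> str:
    """Remove combining marks from other letters but keep ñ/Ñ intact."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in ENYE:
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)
```

Normalizing to NFC first means both encodings of Ñ are handled the same way before any further processing.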