Swahili Graphemic Module (LGM v1.0)


Alphabet (24 letters)
Swahili uses the English Latin set except:

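  • No Q or X in native words (they only appear in loans).
  • C never stands alone in native words; it occurs only in the digraph “ch”.

Glyph | Latin Chain | Phoneme (IPA) | Notes
------|-------------|---------------|------
A | a | /a/ | open vowel
B | b | /b/ |
C | c | /t͡ʃ/ (as “ch”) | only in the digraph “ch”, before any vowel
D | d | /d/ |
E | e | /ɛ/ | mid-front vowel
F | f | /f/ |
G | g | /g/ | always hard /g/
H | h | /h/ |
I | i | /i/ | high-front vowel
J | j | /d͡ʒ/ |
K | k | /k/ |
L | l | /l/ |
M | m | /m/ | may be syllabic at word start (mtaa)
N | n | /n/ | may be syllabic or prenasalizing
O | o | /ɔ/ | mid-back vowel
P | p | /p/ |
R | r | /ɾ/ | tap/flap
S | s | /s/ |
T | t | /t/ |
U | u | /u/ | high-back vowel
V | v | /v/ |
W | w | /w/ | approximant
Y | y | /j/ | approximant
Z | z | /z/ |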

Interpretation Layer (Definition → Application)

Swahili orthography is phonemic:

  • Each grapheme has one consistent sound, and there are no silent letters.
  • Digraphs and trigraphs such as “ng’” /ŋ/ and “ny” /ɲ/ are multi-letter graphemes that our system treats as single phonemes.

Multi-letter Graphemes in Swahili

  • ch → /t͡ʃ/ (spelled with c + h, always affricate)
  • sh → /ʃ/
  • ng → /ŋg/ (prenasalized stop)
  • ng’ → /ŋ/ (velar nasal only)
  • ny → /ɲ/ (palatal nasal)

We treat these as grapheme clusters with phoneme_unit=true in LGM metadata.
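
One way to picture the cluster handling is a greedy longest-match segmenter, sketched below. The names `CLUSTERS` and `segment` are hypothetical, and giving single letters `phoneme_unit=False` is an assumption made for illustration; the actual LGM metadata format is not specified here.

```python
# Multi-letter graphemes from the list above, mapped to their IPA values.
CLUSTERS = {
    "ng'": "ŋ",
    "ch": "t͡ʃ",
    "sh": "ʃ",
    "ng": "ŋg",
    "ny": "ɲ",
}

# Try longer clusters first so that "ng'" wins over "ng".
_BY_LENGTH = sorted(CLUSTERS, key=len, reverse=True)


def segment(word):
    """Return (grapheme, phoneme_unit) pairs using greedy longest-match."""
    # Normalize case and the typographic apostrophe (U+2019) used in "ng’".
    text = word.lower().replace("\u2019", "'")
    units = []
    i = 0
    while i < len(text):
        for cluster in _BY_LENGTH:
            if text.startswith(cluster, i):
                units.append((cluster, True))
                i += len(cluster)
                break
        else:
            # Assumption: single letters carry phoneme_unit=False.
            units.append((text[i], False))
            i += 1
    return units


print(segment("ng'ombe"))
print(segment("chungwa"))
```

Checking the three-character cluster before the two-character ones is what keeps “ng’ombe” from being split as “ng” plus a stray apostrophe. The IPA value for any cluster the segmenter returns can then be read from `CLUSTERS`.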


How This Fits the Lattice

Because Swahili uses the Latin framework, our universal system can:

  • Directly integrate each grapheme into the existing Latin chain.
  • Flag its phonemic interpretation so we avoid English bias (e.g., “g” is never /d͡ʒ/ in Swahili).
  • Preserve multi-letter graphemes as single phoneme nodes for morpheme assembly.
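
The language flag in the second point could look something like the sketch below. `GraphemeNode`, its `readings` field, and the language codes are hypothetical names chosen for illustration, not part of the LGM specification. The point is only that a lookup for one language never falls back to another language's reading.

```python
from dataclasses import dataclass, field


@dataclass
class GraphemeNode:
    """Hypothetical node in the shared Latin chain."""
    grapheme: str
    # Language code -> list of IPA readings recorded for that language.
    readings: dict = field(default_factory=dict)

    def phonemes(self, lang):
        # Refuse to guess: a missing language raises instead of borrowing
        # another language's reading (which is how English bias creeps in).
        if lang not in self.readings:
            raise KeyError(f"no {lang!r} reading recorded for {self.grapheme!r}")
        return self.readings[lang]


# "g" is always hard in Swahili; English also allows the affricate.
G = GraphemeNode("g", {"sw": ["g"], "en": ["g", "d͡ʒ"]})
print(G.phonemes("sw"))
```

Raising on an unknown language, rather than defaulting to English, is a design choice of this sketch; a real system might prefer an explicit "unknown" marker.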