Amharic / Geʽez Graphemic Module (LGM v1.0)


Geʽez script (ግዕዝ)

Script Basics

  • 33 base consonants (fidels) in the Amharic inventory; classical Geʽez uses 26.
  • Each has 7 vowel forms (orders), giving 33 × 7 = 231 distinct graphemes.
  • The vowel is part of each glyph: the first form carries the inherent vowel /ä/ (close to schwa), and the other six forms mark their vowel by modifying the base shape.
  • Written left-to-right.

Series Structure (Example with ሀ-series)

For each base consonant, the 7 orders are:

Order  Vowel  Example (ሀ-series)  Latin Chain  Phoneme (IPA)  Notes
1      ä      ሀ                   h + ä        /hä/           inherent vowel
2      u      ሁ                   h + u        /hu/
3      i      ሂ                   h + i        /hi/
4      a      ሃ                   h + a        /ha/
5      e      ሄ                   h + e        /he/
6      ə      ህ                   h + ə        /hɨ/           “reduced” vowel
7      o      ሆ                   h + o        /ho/
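
The seven orders can be generated rather than typed out. The sketch below assumes that the Unicode Ethiopic block stores orders 1 to 7 of a consonant at consecutive code points starting from the first-order glyph (true for ሀ at U+1200); the names VOWELS and series are illustrative and not part of the module.

```python
# Vowel labels for orders 1..7, in the order used by the table above.
VOWELS = ["ä", "u", "i", "a", "e", "ə", "o"]

def series(base_glyph: str, latin: str) -> list[tuple[int, str, str]]:
    """Return (order, glyph, latin chain) for the seven orders of one consonant.

    Assumption: base_glyph is the first-order form, and orders 1..7 sit at
    consecutive Unicode code points starting from it.
    """
    start = ord(base_glyph)
    return [(i + 1, chr(start + i), latin + vowel) for i, vowel in enumerate(VOWELS)]

# Print the ሀ-series, one order per line.
for order, glyph, chain in series("ሀ", "h"):
    print(order, glyph, chain)
```

Note that the Latin chain here is a plain concatenation; the table's IPA column differs for order 6 (chain h + ə, phoneme /hɨ/), so a phoneme column would need its own mapping.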

Full Base Consonant List (first few as example)

We’ll map each to its 7 forms.

  1. ሀ-series → h
  2. ለ-series → l
  3. መ-series → m
  4. ሠ-series → ś (historically a lateral fricative; pronounced /s/ in modern Amharic)
  5. ረ-series → r
  6. ሰ-series → s
  7. ሸ-series → sh
  8. ቀ-series → q (ejective velar /kʼ/)
  9. በ-series → b
  10. ቨ-series → v
    … (continues through all 33 base consonants)
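
One possible way to hold this list is a lookup table keyed by the first-order glyph. The sketch below covers only the ten series listed above and leaves the rest out. It assumes that each series begins at a Unicode code point that is a multiple of 8, so the base of any syllable can be found by clearing the low three bits. BASE_TO_LATIN, base_of and latin_consonant are hypothetical names.

```python
# The ten series listed above; the remaining consonants are intentionally omitted.
BASE_TO_LATIN = {
    "ሀ": "h", "ለ": "l", "መ": "m", "ሠ": "ś", "ረ": "r",
    "ሰ": "s", "ሸ": "sh", "ቀ": "q", "በ": "b", "ቨ": "v",
}

def base_of(glyph: str) -> str:
    """Return the first-order glyph of the series that `glyph` belongs to.

    Assumption: every series starts at a code point that is a multiple of 8.
    """
    return chr(ord(glyph) & ~7)

def latin_consonant(glyph: str) -> str:
    """Look up the Latin consonant for any form of a listed series."""
    return BASE_TO_LATIN[base_of(glyph)]

print(latin_consonant("ሙ"))  # second-order form of the መ-series
```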

Literal–Graphemic Encoding for Lattice

We represent each grapheme as:

glyph: "ሄ"
base_consonant: "h"
vowel: "e"
latin_chain: ["h", "e"]
phoneme: "he"
series_order: 5

This lets the system:

  • Recognize the consonant root
  • Apply the vowel order
  • Maintain phonetic consistency
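
As a minimal sketch, the record above could be produced by a small decomposition routine. The field names follow the record; the Grapheme class, the decompose function, and the partial consonant table are assumptions made for illustration, as is the reliance on Unicode code point layout (seven orders at consecutive code points, each series aligned to a multiple of 8).

```python
from dataclasses import dataclass

VOWELS = ["ä", "u", "i", "a", "e", "ə", "o"]       # orders 1..7
BASE_TO_LATIN = {"ሀ": "h", "ለ": "l", "መ": "m"}   # partial, illustrative table

@dataclass
class Grapheme:
    glyph: str
    base_consonant: str
    vowel: str
    latin_chain: list[str]
    phoneme: str
    series_order: int

def decompose(glyph: str) -> Grapheme:
    """Split one syllable glyph into consonant root and vowel order."""
    code = ord(glyph)
    offset = code & 7                 # position within the 8-slot series
    if offset > 6:
        raise ValueError("only the seven basic orders are handled")
    consonant = BASE_TO_LATIN[chr(code - offset)]
    vowel = VOWELS[offset]
    # phoneme is a plain concatenation here, as in the record above
    return Grapheme(glyph, consonant, vowel, [consonant, vowel],
                    consonant + vowel, offset + 1)

print(decompose("ሄ"))
```

Glyphs whose series is missing from the table raise a KeyError, which keeps unsupported input visible rather than guessing a consonant.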

Why Amharic/Geʽez is Important in the Lattice

  • Adds a non-Latin African script that’s structurally different from the Latin-based orthographies of Hausa, Zulu, and Swahili.
  • Shows how abugidas can be integrated by decomposing into base consonant + vowel flag, same logic as Baybayin or Devanagari.
  • Enables bidirectional transliteration (Geʽez ↔ Latin) without loss, provided each base consonant keeps its own distinct Latin symbol.
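
A round trip can be checked with a few lines. This is a sketch under stated assumptions: the consonant table is partial, the Latin side is kept as (consonant, vowel) pairs so that multi-letter symbols stay unambiguous, and glyph arithmetic relies on each series occupying consecutive Unicode code points aligned to a multiple of 8. The names to_latin and to_geez are hypothetical.

```python
VOWELS = ["ä", "u", "i", "a", "e", "ə", "o"]   # orders 1..7
# Partial, illustrative table: first-order glyph -> Latin consonant.
BASE_TO_LATIN = {"ሀ": "h", "ለ": "l", "መ": "m", "ሰ": "s", "በ": "b"}
LATIN_TO_BASE = {latin: glyph for glyph, latin in BASE_TO_LATIN.items()}

# The round trip is lossless only if no two series share a Latin symbol.
assert len(LATIN_TO_BASE) == len(BASE_TO_LATIN)

def to_latin(text: str) -> list[tuple[str, str]]:
    """Geʽez -> list of (consonant, vowel) pairs."""
    chains = []
    for glyph in text:
        offset = ord(glyph) & 7
        if offset > 6:
            raise ValueError("only the seven basic orders are handled")
        chains.append((BASE_TO_LATIN[chr(ord(glyph) - offset)], VOWELS[offset]))
    return chains

def to_geez(chains: list[tuple[str, str]]) -> str:
    """(consonant, vowel) pairs -> Geʽez."""
    return "".join(chr(ord(LATIN_TO_BASE[c]) + VOWELS.index(v)) for c, v in chains)

word = "ሰላም"
chains = to_latin(word)
print(chains)
print(to_geez(chains) == word)
```

The uniqueness assertion is the important part: if two series were mapped to the same Latin symbol, the reverse table would silently drop one of them and the conversion back to Geʽez would no longer be exact.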