PGM-10 — Italian (IT_LATN)


MINTED

Purpose

Precise, reversible mapping between phones/phonemes and modern Italian orthography, with support for predictable pronunciation rules, stress marking, loanword handling, and controlled-loss ASCII folding. Integrates with the Master Cross-Lattice Index (MCLI) for cross-script compatibility.

Identity

pgm::v1.0::IT_LATN::<profile>
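
As an illustration of the identifier layout, the sketch below splits an identity string into its fields. The names `PgmId` and `parse_pgm_id` are hypothetical, and the profile check assumes the four profiles listed in this document are the only valid ones.

```python
from typing import NamedTuple

# Assumption: only the four profiles named in this spec are accepted.
PROFILES = {"std", "pedagogic", "ascii", "ipa"}


class PgmId(NamedTuple):
    version: str
    module: str
    profile: str


def parse_pgm_id(identifier: str) -> PgmId:
    """Split 'pgm::<version>::<module>::<profile>' into its fields."""
    parts = identifier.split("::")
    if len(parts) != 4 or parts[0] != "pgm":
        raise ValueError(f"not a PGM identifier: {identifier!r}")
    _, version, module, profile = parts
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return PgmId(version, module, profile)


print(parse_pgm_id("pgm::v1.0::IT_LATN::std"))
```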

Orthography Profiles

  • IT_LATN.std — Standard orthography (accents written only where the orthography requires them: stressed final vowels and minimal-pair distinguishers).
  • IT_LATN.pedagogic — Adds stress marks on every stressed vowel, including words where stress falls in the default (penultimate) position.
  • IT_LATN.ascii — ASCII fallback; loss controlled (diacritics removed, stress not marked).
  • IT_LATN.ipa — Direct IPA output for linguistic purposes.

Lossiness

  • std and pedagogic: none for the contrasts each profile writes (orthography is highly phonemic); std leaves mid-vowel quality and non-final stress unwritten.
  • ascii: controlled (diacritics and optional apostrophes for elision lost).
  • ipa: none.

Script Mechanics

  • Alphabet: 21 native letters (A–Z without J, K, W, X, Y) + loan letters (J, K, W, X, Y) in foreign words.
  • Digraphs/trigraphs:
    • ch/gh → /k ɡ/ before e, i.
    • ci/gi → /t͡ʃ d͡ʒ/ before a, o, u; plain c/g → /t͡ʃ d͡ʒ/ before e, i.
    • gli → /ʎ/; gn → /ɲ/.
    • sc → /ʃ/ before e, i; sci → /ʃ/ before a, o, u; sc → /sk/ elsewhere.
  • Vowel system: /a e ɛ i o ɔ u/; the open-mid vowels /ɛ ɔ/ contrast with /e o/ only in stressed syllables, and unstressed mid vowels are always close-mid.
  • Consonants: gemination is phonemic and written by doubling the consonant letter, except for the affricates /t͡s d͡z/ spelled z; assimilation across word boundaries occurs in speech but is not marked in orthography.
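  • Stress: defaults to the penultimate syllable and is unmarked there. Word-final stress is always written with an accent on the final vowel: grave (`) on à, ì, ò, ù and on open-mid è; acute (´) on close-mid é, as in perché. Antepenultimate stress is lexical and unmarked in std.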

Phoneme Inventory (MCLI-linked)

Vowels: /a e ɛ i o ɔ u/ (all short, length allophonic).
Consonants: /p b t d k ɡ t͡s d͡z t͡ʃ d͡ʒ f v s z ʃ m n ɲ ʎ r l/ + geminated forms.


Mapping Logic

Phones → Graphemes

  1. Identify stress location: in std, write an accent only on a stressed final vowel; in pedagogic, mark every stress position.
  2. Map consonants to single letters or digraphs per context:
    • /k ɡ/ + [e i] → ch/gh; + [a o u] → c/g.
    • /t͡ʃ d͡ʒ/ before [a o u] → ci/gi; before [e i] → c/g.
    • /ʎ/ → gli; /ɲ/ → gn.
    • /ʃ/ before [e i] → sc; elsewhere → sci.
  3. Apply vowel quality rules: /e ɛ/ are both written e and /o ɔ/ both written o; where an accent is written, acute marks the close-mid vowel and grave the open-mid vowel.
  4. Gemination: double consonant letter except for z (rules preserve phonemic doubling).
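
The consonant and vowel steps above can be sketched as a single pass over space-separated phonemes. This is a minimal illustration for the std profile, not the engine's implementation: `to_graphemes` is a hypothetical name, stress marking is left out, both /t͡s/ and /d͡z/ are written with a single z, and /ʎ/ is emitted as gl before /i/ so the i is not written twice (an assumption beyond the rule list). Geminates are expected as repeated phonemes in the input, so doubling falls out of the per-segment mapping.

```python
FRONT = {"e", "ɛ", "i"}
BACK = {"a", "o", "ɔ", "u"}
# std writes both mid-vowel qualities with the same letter.
VOWELS = {"a": "a", "e": "e", "ɛ": "e", "i": "i", "o": "o", "ɔ": "o", "u": "u"}


def to_graphemes(phonemes: str) -> str:
    """Map a string like '/f a t t o/' to std spelling (stress not handled)."""
    segs = phonemes.strip("/ ").split()
    out = []
    for i, seg in enumerate(segs):
        nxt = segs[i + 1] if i + 1 < len(segs) else ""
        if seg in VOWELS:
            out.append(VOWELS[seg])
        elif seg in ("k", "ɡ", "g"):  # accept IPA ɡ and ASCII g
            base = "c" if seg == "k" else "g"
            out.append(base + "h" if nxt in FRONT else base)
        elif seg in ("t͡ʃ", "d͡ʒ"):
            base = "c" if seg == "t͡ʃ" else "g"
            out.append(base + "i" if nxt in BACK else base)
        elif seg == "ʎ":
            # Assumption: avoid a doubled i before /i/ (as in "figli").
            out.append("gl" if nxt == "i" else "gli")
        elif seg == "ɲ":
            out.append("gn")
        elif seg == "ʃ":
            out.append("sc" if nxt in FRONT else "sci")
        elif seg in ("t͡s", "d͡z"):
            out.append("z")
        else:
            out.append(seg)  # remaining consonants map to themselves
    return "".join(out)


for example in ("/t͡ʃ a o/", "/f a t t o/", "/ɔ k k i/"):
    print(example, "->", to_graphemes(example))
```

Because a geminate velar before a front vowel arrives as two segments, the first is written c and the second ch, which yields spellings such as occhi without a dedicated gemination rule.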

Graphemes → Phones

  • Apply reverse context rules for digraphs.
  • Derive vowel quality from stress and written accents (é → /e/, è → /ɛ/, ò → /ɔ/); unstressed e and o are always close-mid.
  • Restore gemination from double letters.
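
A rough sketch of the reverse direction, under stated assumptions: `to_phonemes` and its helper are hypothetical names, unaccented e/o default to close-mid, z defaults to /t͡s/ (voicing is lexical), stress is not returned, and lexical exceptions such as glicine are not handled. Doubled c/g take the value of the second letter so that cch and cci read as geminates.

```python
import unicodedata

HARD = {"c": "k", "g": "ɡ"}
SOFT = {"c": "t͡ʃ", "g": "d͡ʒ"}
ACCENTED = {"à": "a", "è": "ɛ", "é": "e", "ì": "i", "ò": "ɔ", "ó": "o", "ù": "u"}
FRONT = set("eièéì")
BACK = set("aouàòóù")
VOWELS = FRONT | BACK
OTHER = {"z": ["t͡s"], "q": ["k"], "h": []}  # assumptions, see lead-in


def _match(w: str, i: int) -> tuple[list[str], int]:
    """Return (phonemes, letters consumed) for the grapheme starting at w[i]."""
    rest = w[i:]

    def at(n: int) -> str:
        return rest[n] if n < len(rest) else ""

    c = rest[0]
    if c in ACCENTED:
        return [ACCENTED[c]], 1
    if not c.isalpha():
        return [], 1  # apostrophes and punctuation carry no phoneme
    if c in HARD and at(1) == c:  # doubled c/g: geminate of the second letter
        ph, n = _match(w, i + 1)
        return ph + ph, n + 1
    if rest.startswith("gli"):
        # Before a vowel the i is only part of the trigraph; otherwise it is
        # a real vowel and is left for the next step.
        return (["ʎ"], 3) if at(3) in VOWELS else (["ʎ"], 2)
    if rest.startswith("gn"):
        return ["ɲ"], 2
    if rest.startswith("sci") and at(3) in BACK:
        return ["ʃ"], 3
    if rest.startswith("sc") and at(2) in FRONT:
        return ["ʃ"], 2
    if rest.startswith(("ch", "gh")):
        return [HARD[c]], 2
    if c in HARD:
        if at(1) == "i" and at(2) in BACK:
            return [SOFT[c]], 2
        if at(1) in FRONT:
            return [SOFT[c]], 1
        return [HARD[c]], 1
    if c in OTHER:
        return list(OTHER[c]), 1
    return [c], 1


def to_phonemes(word: str) -> list[str]:
    w = unicodedata.normalize("NFC", word.lower())
    out, i = [], 0
    while i < len(w):
        ph, n = _match(w, i)
        out.extend(ph)
        i += n
    return out


for word in ("ciao", "fatto", "occhi", "figlio"):
    print(word, "->", " ".join(to_phonemes(word)))
```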

Edge Policies

  • Loanwords: keep foreign spelling in std; adapt to phonemic Italian spelling in pedagogic.
  • Elision: optional apostrophe retention; ascii folding drops it.
  • Stress override: an accent is required on a stressed final syllable (per standard orthography); marks in other positions are optional and appear only in pedagogic output.

YAML Skeleton (engine spec)

pgm_version: "1.0"
language: "IT"
script_pref: ["IT_LATN","IT_ASCII","IT_IPA"]

profiles:
  - id: "std"
    orthography_profile: "IT_STD_2025"
  - id: "pedagogic"
    orthography_profile: "IT_PEDAGOGIC_2025"
  - id: "ascii"
    orthography_profile: "IT_ASCII_2025"
  - id: "ipa"
    orthography_profile: "IT_IPA_2025"

inventory:
  digraphs:
    - {ipa: "k", context: "[e i]", map: "ch"}
    - {ipa: "ɡ", context: "[e i]", map: "gh"}
    - {ipa: "t͡ʃ", context: "[a o u]", map: "ci"}
    - {ipa: "d͡ʒ", context: "[a o u]", map: "gi"}
    - {ipa: "ʎ", context: "any", map: "gli"}
    - {ipa: "ɲ", context: "any", map: "gn"}
    - {ipa: "ʃ", context: "[e i]", map: "sc"}
    - {ipa: "ʃ", context: "elsewhere", map: "sci"}
  vowels:
    - {ipa: "a", map: "a"}
    - {ipa: "e", map: "e"}
    - {ipa: "ɛ", map: "è"}  # accented form; written as plain e where the profile writes no accent
    - {ipa: "i", map: "i"}
    - {ipa: "o", map: "o"}
    - {ipa: "ɔ", map: "ò"}  # accented form; written as plain o where the profile writes no accent
    - {ipa: "u", map: "u"}
operators:
  - {name: "apply_stress_rules", fn: "mark_final_if_needed_in_std; mark_all_in_pedagogic"}
  - {name: "gemination", fn: "double_letter_except_z"}
  - {name: "ascii_fold", fn: "remove_diacritics_and_apostrophes"}
lossiness:
  to_ascii: "controlled"

Unit Test Fixtures

tests:
  - id: "IT_001_ciao"
    in_phonemes: "/t͡ʃ a o/"
    profile: "std"
    expect: "ciao"

  - id: "IT_002_gemination"
    in_phonemes: "/f a t t o/"
    profile: "std"
    expect: "fatto"

  - id: "IT_003_stress_final"
    in_phonemes: "/p e r k e/"
    stress: "final"
    profile: "std"
    expect: "perché"

  - id: "IT_004_ascii"
    in_graphemes: "perché"
    profile: "ascii"
    expect: "perche"

  - id: "IT_005_pedagogic"
    in_phonemes: "/a m i k o/"
    stress: "penult"
    profile: "pedagogic"
    expect: "amìco"

Worked Micro-Examples

  • /t͡ʃ a o/ → ciao (“hello”).
  • /f a t t o/ → fatto (“done”): geminated /t/ is doubled in writing.
  • /perˈke/ → perché (“why/because”): final stress is marked.
  • ASCII fold: perché → perche.
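
The ASCII fold can be sketched with Unicode decomposition from the Python standard library. The function name `ascii_fold` is hypothetical, and the reading of the two `pgm.ascii.mode` values (strip_diacritics keeps apostrophes, strip_all_marks also drops them) is an assumption, since this document does not define them.

```python
import unicodedata

APOSTROPHES = {"'", "\u2019"}  # ASCII and typographic apostrophe


def ascii_fold(text: str, mode: str = "strip_diacritics") -> str:
    """Remove combining accents; optionally remove elision apostrophes too."""
    if mode not in {"strip_diacritics", "strip_all_marks"}:
        raise ValueError(f"unknown mode: {mode!r}")
    # NFD splits an accented letter into base letter + combining mark (Mn).
    decomposed = unicodedata.normalize("NFD", text)
    chars = [ch for ch in decomposed if unicodedata.category(ch) != "Mn"]
    if mode == "strip_all_marks":
        chars = [ch for ch in chars if ch not in APOSTROPHES]
    return "".join(chars)


print(ascii_fold("perch\u00e9"))
print(ascii_fold("l'amico", mode="strip_all_marks"))
```

The fold is one-way: once the accent is removed, the stress position and the é/è distinction cannot be recovered from the ASCII form alone, which is why this profile is classed as controlled loss.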

Operational Knobs

  • pgm.profile=std|pedagogic|ascii|ipa
  • pgm.stress_marking=final_only|all|none
  • pgm.ascii.mode=strip_diacritics|strip_all_marks
  • pgm.lossiness_report=true
  • pgm.audit_trace=true
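
To show how the `pgm.stress_marking` values might behave, here is a small sketch. The helper `mark_stress`, its `index` argument (character position of the stressed vowel letter), and the `close_mid` flag are all hypothetical; the engine's actual interface is not described in this document.

```python
import unicodedata

GRAVE, ACUTE = "\u0300", "\u0301"
VOWEL_LETTERS = {"a", "e", "i", "o", "u"}


def mark_stress(word: str, index: int, mode: str = "final_only",
                close_mid: bool = False) -> str:
    """Add an accent to word[index] according to the stress_marking mode."""
    if mode not in {"final_only", "all", "none"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if word[index] not in VOWEL_LETTERS:
        raise ValueError("index must point at an unaccented vowel letter")
    is_final = index == len(word) - 1
    if mode == "none" or (mode == "final_only" and not is_final):
        return word
    # Acute marks close-mid e/o; every other stressed vowel takes the grave.
    accent = ACUTE if close_mid and word[index] in {"e", "o"} else GRAVE
    marked = unicodedata.normalize("NFC", word[index] + accent)
    return word[:index] + marked + word[index + 1:]


print(mark_stress("perche", 5, close_mid=True))
print(mark_stress("amico", 2, mode="all"))
print(mark_stress("amico", 2, mode="final_only"))
```

Under these assumptions, final_only corresponds to the std profile, all to the pedagogic profile, and none to output where accents are suppressed.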

PGM-10 (Italian) is MINTED and integrated into the Master Cross-Lattice Index.