Phonetic Symbol Downgrades in Amharic

Downgrades of some letters occur "vertically"; that is, the conversion occurs in a single direction and is not reversible:

Vertical Only
[#ፀ#]    [#ሠ#]    [ዑዒዔዖ]
  ↓        ↓         ↓
[#ጸ#]    [#ሰ#]    [ኡኢኤኦ]

We can represent these rules with the following collection of transliterations:

tr/ፀ-ፆ/ጸ-ጾ/;
tr/ሠ-ሧ/ሰ-ሷ/;
tr/ዑዒዔዖ/ኡኢኤኦ/;
tr/ሗቍኵጕኧ/ኋቁኩጉእ/;

These transliterations in turn can be combined into a single representation:

tr/ሗሠ-ሧቍኵጕኧዑዒዔዖፀ-ፆ/ኋሰ-ሷቁኩጉእኡኢኤኦጸ-ጾ/;
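
The combined operation can also be sketched outside of Perl. The following Python is only an illustration: the helper names `expand`, `tr_table` and `downgrade_vertical` are hypothetical, and it assumes that a range such as "ሠ-ሧ" covers consecutive Unicode code points, which holds for the series used here.

```python
def expand(spec: str) -> str:
    """Expand tr-style ranges such as 'ሠ-ሧ' into the full run of characters.

    Assumption: '-' only ever appears between the two ends of a range.
    """
    out = []
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "-":
            first, last = ord(spec[i]), ord(spec[i + 2])
            out.extend(chr(c) for c in range(first, last + 1))
            i += 3
        else:
            out.append(spec[i])
            i += 1
    return "".join(out)


def tr_table(src: str, dst: str) -> dict:
    """Build a translation table pairing characters by position, as tr/// does."""
    src, dst = expand(src), expand(dst)
    if len(src) != len(dst):
        raise ValueError("source and target lists differ in length")
    return str.maketrans(src, dst)


# The combined vertical-only rules from the text above.
VERTICAL = tr_table("ሗሠ-ሧቍኵጕኧዑዒዔዖፀ-ፆ", "ኋሰ-ሷቁኩጉእኡኢኤኦጸ-ጾ")


def downgrade_vertical(text: str) -> str:
    """Apply every vertical downgrade in a single pass."""
    return text.translate(VERTICAL)
```

The length check in `tr_table` is a deliberate choice: because characters are paired by position, a single misplaced letter in either list silently shifts every later pair.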

Lateral downgrades occur in a number of cases. Lateral conversions are bidirectional; either letter of a pair may be converted into the other:

Lateral Only
ዕ ⇄ እ    ቈ ⇄ ቆ    ኰ ⇄ ኮ    ጐ ⇄ ጎ

Note that care must be taken not to retransliterate characters already transliterated (e.g. "ዕ" converted into "እ" should not be converted again into "ዕ"). We update our transliteration to include these additions:

tr/ሗሠ-ሧኧቍኵጕቈኰጐቆኮጎእዑዒዔዕዖፀ-ፆ/ኋሰ-ሷእቁኩጉቆኮጎቈኰጐዕኡኢኤእኦጸ-ጾ/;

"እ" as the first character of a word is more stable and less likely to be substituted for "ዕ". To reduce the number of generated improbable word renderings, "እ" is not converted at the start of a word.

The downgrade rules of the "[=#ሀ#=]" and "[=#አ#=]" families are more complex: both vertical and lateral downgrades apply. Note that two conversions of a single character are possible when both a vertical and a lateral conversion apply. The decay of "ኍ" is more complex than that of other forms, as its target may decay further:

Lateral and Vertical
      ኍ
      ↓
[ኁኂኄኅኆ] ⇄ [ሑሒሔሕሖ]   [ኹኺኼኽኾ]
       ⇘        ⇓        ⇙
          [ሁሂሄህሆ]

  ዓ ⇄ ዐ   ኣ
   ↘  ↓  ↙
      አ

It is uncertain whether any canonical renderings exist with the set [ኹኺኼኽኾ]; they have been added for completeness.

We update our transliteration for ኣ and ኹኺኼኽኾ only:

tr/ሗሠ-ሧኣኧቍኵጕቈኰጐቆኮጎእዑዒዔዕዖፀ-ፆኹኺኼኽኾ/ኋሰ-ሷአእቁኩጉቆኮጎቈኰጐዕኡኢኤእኦጸ-ጾሁሂሄህሆ/;

For clarity we will now represent the other downgrades in new transliteration and substitution operations:

tr/ኁኂኄኅኆ/ሑሒሔሕሖ/;
tr/ሑሒሔሕሖ/ኁኂኄኅኆ/;
tr/ሑሒሔሕሖኁኂኄኅኆ/ሁሂሄህሆሁሂሄህሆ/;
s/[ዓዐ]/አ/;
tr/ዓዐ/ዐዓ/;
s/ኍ/ኁ/;
s/ኍ/ሁ/;
s/ኍ/ሑ/; # note: this seems a hair unlikely

Note that where two target conversions occur, the string operated on is duplicated and a single target conversion is applied to each copy. For example, the string "ዓለም" converts to the two new strings "ዐለም" and "አለም". Future conversions will be applied to both outcome strings identically.
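
The duplication step can be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: the helpers `tr`, `sub_first` and `variants` are hypothetical, and each substitution replaces only the first match, mirroring s/// without the /g flag.

```python
import re


def tr(src, dst):
    """Return a rule that transliterates character by character."""
    table = str.maketrans(src, dst)
    return lambda s: s.translate(table)


def sub_first(pattern, repl):
    """Return a rule that replaces only the first match of the pattern."""
    return lambda s: re.sub(pattern, repl, s, count=1)


# The operations listed above, in the same order.
RULES = [
    tr("ኁኂኄኅኆ", "ሑሒሔሕሖ"),
    tr("ሑሒሔሕሖ", "ኁኂኄኅኆ"),
    tr("ሑሒሔሕሖኁኂኄኅኆ", "ሁሂሄህሆሁሂሄህሆ"),
    sub_first("[ዓዐ]", "አ"),
    tr("ዓዐ", "ዐዓ"),
    sub_first("ኍ", "ኁ"),
    sub_first("ኍ", "ሁ"),
    sub_first("ኍ", "ሑ"),
]


def variants(word):
    """Apply each rule to its own copy of the word; keep the copies that changed."""
    results = []
    for rule in RULES:
        new = rule(word)
        if new != word and new not in results:
            results.append(new)
    return results
```

With these rules, `variants("ዓለም")` yields the two strings given in the example, "አለም" and "ዐለም". Every rule starts from the original word, so no rule ever sees the output of another.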

The most complex downgrade occurs for letters in [=ሃ=], which have multiple lateral and multiple vertical targets. Each node in the path of the diagram is a valid alternative from a given starting point. "ሐ" and "ኃ" are shown at opposite extremes, as the conversion of one into the other is less likely than conversions between letters in closer proximity. Such conversions can certainly be found, though, for example "ሐምሌ" vs "ኃምሌ":

Lateral and Vertical in [=ሃ=]
ሐ ⇄ ኀ ⇄ ኃ
 ↘ ↙ ↘ ↙ 
  ሀ ⇄ ሃ  

We update our transliteration operation for the lateral transformation of "ሀ" and "ሃ" only, as each has a single target:

tr/ሀሃሗሠ-ሧኣኧቍኵጕቈኰጐቆኮጎእዑዒዔዕዖፀ-ፆኹኺኼኽኾ/ሃሀኋሰ-ሷአእቁኩጉቆኮጎቈኰጐዕኡኢኤእኦጸ-ጾሁሂሄህሆ/;

Conversions that occur for the other members in "[=ሃ=]" will be as follows; note that a new word is generated for each applicable conversion:

s/[ሐኀኃ]/ሀ/;
s/[ሐኀኃ]/ሃ/;
s/[ሐኀ]/ኃ/;
s/[ኀኃ]/ሐ/;
s/[ሐኃ]/ኀ/;

No canonical renderings are known for "ሓ" and "ኻ"; they are added here for completeness and should be handled by software. "አ" at the start of a few special words (e.g. "አገር" alongside "ኃገር", "ሀገር" and "ሃገር") may become an alternative rendering for words starting in "[=ሃ=]". It is uncertain which rendering (starting with "አ" or "[=ሃ=]") is in fact the canonical one; thus the arrows shown may be bidirectional and the relation then lateral. However, as the words for which this occurs are very few relative to all words starting in either "አ" or "[=ሃ=]", software should not attempt to generate these renderings, as the outcome will almost always be invalid. "አ" is presented in the following diagram for academic interest:

Lateral and Vertical in [=ሃ=] Revised

ሐ ⇄ ኀ ⇄ ኃ
 ↘ ↙ ↘ ↙
  ሀ ⇄ ሃ  ←  ኻ
   ↘ ↙
    አ

Our last collection of conversions in "[=ሃ=]" is updated accordingly:

s/[ሐኀኃኻሓ]/ሀ/;
s/[ሐኀኃኻሓ]/ሃ/;
s/[ሐኀሓ]/ኃ/;
s/[ኀኃሓ]/ሐ/;
s/[ሐኃሓ]/ኀ/;
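
One way to cross-check these rules against the diagram is to invert them into a list of targets per source letter. The Python below is a small sketch; the names `HA_RULES` and `targets_by_letter` are hypothetical, and the rule list is simply a transcription of the five substitutions above.

```python
# (source letters, target letter), transcribed from the substitutions above.
HA_RULES = [
    ("ሐኀኃኻሓ", "ሀ"),
    ("ሐኀኃኻሓ", "ሃ"),
    ("ሐኀሓ", "ኃ"),
    ("ኀኃሓ", "ሐ"),
    ("ሐኃሓ", "ኀ"),
]


def targets_by_letter(rules):
    """Map each source letter to the targets it may be converted into."""
    table = {}
    for sources, target in rules:
        for letter in sources:
            table.setdefault(letter, []).append(target)
    return table


HA_TARGETS = targets_by_letter(HA_RULES)
```

Read this way, "ኻ" reaches only "ሀ" and "ሃ", while "ሓ" reaches all five targets, and no rule ever produces "ሓ" or "ኻ".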