From: Segmentation-free optical character recognition for printed Urdu text
Dot/Diacritic | Example image | Value | Description |
---|---|---|---|
No dots | – | −1 | Absence of dot/diacritic |
One dot above | [image] | 01 | One dot above baseline |
One dot below | [image] | 02 | One dot below baseline |
Two dots above | [image] | 03 | Two dots above baseline |
Two dots below | [image] | 04 | Two dots below baseline |
Three dots above | [image] | 05 | Three dots above baseline |
Three dots below | [image] | 06 | Three dots below baseline |
Hey stroke | [image] | 07 | Secondary stroke of “Aik-chashmi-hey” when used as joiner |
Gaaf kash | [image] | 08 | Secondary stroke of “Gaaf” |
Madda | [image] | 09 | Secondary stroke appearing with “Alif” to form “Alif-mad-aa” |
Hamza-e-izafat | [image] | 10 | Secondary stroke appearing with “Bay” class when used as joiner |
Shadda | [image] | 11 | Arabic-like Tashdid |
Full stop | [image] | 12 | Full stop |
Choti toyen | [image] | 13 | Secondary stroke of “Thay,” “Rday,” and “Dhaal” |
Hamza | [image] | 14 | Secondary stroke with “Bay” class; sometimes appears in isolation |
Comma | [image] | 15 | Comma |
Question mark | [image] | 16 | Question mark |
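
The value column above is effectively a lookup table from dot/diacritic class to a numeric code. A minimal sketch of how such a table might be represented in code follows. The names `Diacritic` and `label` are hypothetical and do not come from the source, and only the numeric values are taken from the table.

```python
from enum import IntEnum


class Diacritic(IntEnum):
    """Dot/diacritic codes as listed in the table (member names are illustrative)."""
    NO_DOTS = -1
    ONE_DOT_ABOVE = 1
    ONE_DOT_BELOW = 2
    TWO_DOTS_ABOVE = 3
    TWO_DOTS_BELOW = 4
    THREE_DOTS_ABOVE = 5
    THREE_DOTS_BELOW = 6
    HEY_STROKE = 7
    GAAF_KASH = 8
    MADDA = 9
    HAMZA_E_IZAFAT = 10
    SHADDA = 11
    FULL_STOP = 12
    CHOTI_TOYEN = 13
    HAMZA = 14
    COMMA = 15
    QUESTION_MARK = 16


def label(code: Diacritic) -> str:
    """Format a code as the table prints it: two digits, or '-1' for no dots."""
    if code is Diacritic.NO_DOTS:
        return "-1"
    return f"{int(code):02d}"


if __name__ == "__main__":
    # Print the full code table in value order.
    for d in sorted(Diacritic):
        print(label(d), d.name)
```

Using an integer enumeration keeps the codes comparable and lets a raw value be converted back to its class, for example `Diacritic(9)` yields `Diacritic.MADDA`.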