Why autocorrect on the iPhone is still so bad

The iPhone's ability to correct our words is still as bad as it was ten years ago. This is no coincidence.

You want to type a simple sentence like "I would certainly take out this insurance policy!" and it comes out as "I would certainly shoot this insurance down!" Autocorrect errors are so commonplace, and have been for so long, that we hardly notice them anymore unless they are unintentionally hilarious, or the insurance agent starts to wonder. (At Macwelt we have always been amused by one wonderful autocorrect fail, which turned the admittedly incorrect word "benchmarken" into the wonderfully absurd "beschnarchen", German for "to snore at". Yes, our test center has thoroughly snored at many a device over the years … Editor's note.)

But why is autocorrect still so bad today, and in many cases so aggravating? The iPhone, the device that introduced and popularized touch-controlled keyboard input, appeared on the scene 15 years ago, and autocorrect has been with us in one form or another since the 1990s, when Word automatically corrected accidental capital letters and common spelling mistakes.

After decades and billions of devices sold, not to mention the meteoric rise of machine learning and artificial intelligence, autocorrect feels as dumb as ever. In some respects it even seems to have regressed, making nonsensical substitutions when a simple letter swap would yield the correct word. Is the technology behind autocorrect just very complicated? Does it not even try to work the way it should? Or does Apple simply no longer give it priority?

The march of the nines

I first heard of the concept of the "march of the nines" about 20 years ago (although I don't know where the term came from). At the time I was researching and writing about the latest dictation software. This was back in the days when computer users had to buy software like Dragon Dictate to talk to their devices.

Dictation software that is 90 percent accurate sounds good, but is worthless. If you have to correct one word out of every ten, you don't really save much time. Even 99 percent accuracy is not really good enough. At 99.9 percent, it gets interesting: if you can dictate 1,000 words to your computer and only have to correct one of them, you save a huge amount of time (not to mention gaining an incredible accessibility tool).

But 99 percent accuracy is not just 9 percentage points (i.e. ten percent, Editor's note) better than 90 percent. It is a tenfold improvement, because the error rate drops from one error per ten words to one error per hundred words.

For every "nine" you add to the accuracy of an automated process, it feels only slightly better to humans, but the system has to improve tenfold to get there. In other words, 99.9999 percent doesn't feel much better to a user than 99.999 percent, but it is still ten times harder for the computer.
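The arithmetic behind the "march of the nines" can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above; the function name expected_corrections and the 1,000-word sample size are assumptions chosen for the example, not part of any dictation product.

```python
from fractions import Fraction

def expected_corrections(accuracy_percent: str, words: int = 1000) -> Fraction:
    """Expected number of words that need fixing out of `words` dictated words.

    The accuracy is passed as a decimal string (e.g. "99.9") so that
    Fraction can represent it exactly, without floating-point noise.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return error_rate * words

# Each extra "nine" cuts the number of corrections by a factor of ten.
for accuracy in ["90", "99", "99.9", "99.99"]:
    fixes = expected_corrections(accuracy)
    print(f"{accuracy}% accurate: about {float(fixes):g} corrections per 1,000 words")
```

Going from 90 to 99 percent takes the user from 100 corrections to 10 per 1,000 words, while going from 99.9 to 99.99 percent takes them from one correction to a tenth of one. The improvement factor is the same each time, but the benefit the user actually notices shrinks.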

Is autocorrect stuck in a "march of the nines"? Is it secretly making giant leaps that seem vanishingly small to us? I do not think so. The error rate of autocorrect is still quite high, while the computing power available to it (especially for machine learning tasks) is a hundred times higher than it was ten years ago. I think it is time to look for other explanations.

Natural language processing that isn't

Whether it's voice assistants like Siri or Alexa, voice dictation or autocorrect, technology companies like to talk about "natural language processing". But true natural language processing is beyond the reach of all these systems. What we are left with is a machine-learning-based statistical analysis of parts of speech that is almost completely devoid of semantic meaning.

Imagine the following: "Go to the grocer on the corner and get me a stick of butter. Make sure it's unsalted."

If I were to ask someone what is meant by "it", everyone would immediately know that I was referring to the butter, although grammatically "it" could just as easily refer to the grocer's shop or the corner. But who has ever heard of an unsalted shop? If we change the second sentence to "Check whether it is open today", we know that "it" refers to the grocer's shop.

It's pretty trivial for humans, but computers are terrible at it, because language systems are built in such a way that they don't know what words actually mean, only what kinds of words they are and how they are spelled.

All of these language-based systems (voice assistants, dictation, autocorrect) rely on large numbers of poorly paid contractors who take speech samples or text sentences and meticulously tag them: noun, verb, adjective, adverb, swear word, proper noun, and more. The language system may know that the misspelled word in "Try this soup I just made" should be "soup", because it should be a noun and most of the letters match the non-word you accidentally typed. But it doesn't know what soup actually is. Nor does it know what the other words in the sentence mean: try, made, just…
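To make that concrete, here is a minimal sketch of this kind of correction: pick the dictionary word that has the expected part of speech and the fewest letter differences from what was typed. The tiny LEXICON, the function names, and the typo "souo" are all made up for illustration, and this is not a description of how Apple's keyboard actually works.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]

# Hypothetical hand-tagged lexicon: word -> part of speech.
# Note that nothing in here says what any of these words mean.
LEXICON = {
    "soup": "noun",
    "soap": "noun",
    "soon": "adverb",
    "stop": "verb",
    "sour": "adjective",
}

def suggest(typed: str, expected_pos: str) -> str:
    """Return the closest lexicon word with the expected part of speech."""
    candidates = [w for w, pos in LEXICON.items() if pos == expected_pos]
    # Ties are broken alphabetically so the result is deterministic.
    return min(candidates, key=lambda w: (edit_distance(typed, w), w))

print(suggest("souo", "noun"))       # noun slot: "Try this souo I just made"
print(suggest("souo", "adjective"))  # same typo in an adjective slot
```

The same typo is "corrected" to a different word depending only on the grammatical slot it sits in. At no point does the sketch check whether the resulting sentence describes something a person could plausibly mean.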

I think this is the real reason why autocorrect is still so bad. It doesn't matter how sophisticated the machine learning is or how large the training set is if the system doesn't know what the words mean, not even superficially.

Just nifty statistical analysis

Google automatically predicts whole sentences for you in Gmail, but even that is only a very sophisticated statistical analysis. It uses machine learning to determine which phrases most often follow the words you just used when responding to an email with a specific distribution of keywords and phrases. It still doesn't know what it all means.
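The idea of predicting "what most often comes next" can be shown with a deliberately simplified stand-in: a word-pair (bigram) counter. This is not Google's method, which is far more elaborate; the four-sentence CORPUS and the function names are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus standing in for a large collection of emails.
CORPUS = [
    "thanks for the update",
    "thanks for the quick reply",
    "thanks for the update on the project",
    "let me know if you have questions",
]

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = model.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

MODEL = train_bigrams(CORPUS)
print(predict_next(MODEL, "thanks"))  # most common word after "thanks"
print(predict_next(MODEL, "the"))     # most common word after "the"
print(predict_next(MODEL, "butter"))  # never seen, so no prediction
```

The model suggests "for" after "thanks" purely because that pair occurs most often in its data. It has no notion of gratitude, updates or projects, which is the limitation the article is describing.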

To return to my original example: autocorrect suggested shooting the insurance down, not knowing that this is a nonsensical sentence, and one with a connotation of violence at that. If my iPhone knew what any of these words actually meant, and not just their grammatical roles, it would be easy for autocorrect to make only suggestions that correspond to, you know, possible human speech. (The fact that it suggested an implicit call to violence, or at best a flippant synonym for "cancel", when the opposite was meant only shows how bad autocorrect still is.)

Autocorrect no longer seems to be a priority

The fact is, autocorrect is no longer the priority it once was. When was the last time Apple announced a massive leap in autocorrect accuracy when introducing a new version of iOS?

At the beginning of the smartphone era, when we all got used to typing with big thumbs on tiny touchscreens, the ability to correct the mistakes of our fat fingers was a big selling point. It was a key feature that pointed to a device's elegant, user-friendly software.

Autocorrect, with all its bugs, is now old and boring. We've put up with its weaknesses for so long that the market no longer considers it a mark of usability. We've moved on to other things, like fancy camera features and notifications. I'm sure there are smart, hard-working engineers at Apple and Google working on autocorrect, but they probably only get a fraction of the resources given to the team responsible for making marginally better photos. Marginally better photos can sell smartphones; marginally better autocorrect cannot.

It will take a huge leap in AI modeling and performance before our smartphones can recognize the semantic meaning of words. But much more could already be done to filter out nonsensical sentences and nonsensical autocorrect suggestions that lead to meaningless ramblings.

I would be happy with any improvement at all. Anything to get autocorrect out of the rut it's in. Because with all the typos we make on the iPhone, shooting autocorrect down altogether is not very helpful either.

Editor's note

The author's premise that modern language-processing systems cannot understand the semantic meaning of a text or of its components, and rely only on sophisticated statistical algorithms, has not been true for some years now. Google, for example, has been using a natural language processing engine for its search for more than two years. It can pick out not only the semantic units of a text but also the relationships between them, as well as the emotional connotation of the entire text. You can also analyze your own text at the link if you wish.

Apple is also doing research to further improve its language engine. Siri is based on the engine from Nuance (Dragon Dictate), but Apple has since developed its voice assistant further. That Apple's language recognition works quite well is shown by Live Text, a current feature of iOS 15 and macOS Monterey. In our experience, we only had to remove the line breaks from the text; Apple's intelligence recognized the content with exactly the desired 99 percent accuracy.

Why, on the other hand, autocorrect still works so poorly is anyone's guess. Obviously, the latest findings from linguistics, or rather computational linguistics, have for some inexplicable reason not been incorporated, and the system still relies on the old, purely statistical methods of prediction. But there are enough third-party tools to prove that language research has taken a giant step forward in the last five years. Our favorite example is our correction tool Language Tool Pro: the tool recognizes the difference between "das" and "dass", a quite common error in German, without any problems. This would not be possible without semantic recognition. hk