Not Seeing is Believing: How Seeing AI and KNFB Reader cut size and price to literally put access to print into the hands of blind readers
By Aser Tolentino, Assistive Technology Instructor
This summer, Microsoft released an app called Seeing AI. It did so with relatively little fanfare when you consider just what the app represents. The app, currently available on iOS with an Android version to follow, provides access to powerful algorithms that recognize not only text, but barcodes, faces and even basic scenes. And in a world where proprietary assistive technology still commands asking prices in the thousands of dollars, Microsoft provided this app for free.
More than twenty-five years after the passage of the Americans with Disabilities Act (ADA), Americans are beginning to see the signs of mainstream adoption of accessibility by electronics manufacturers, major retailers, and consumer brands online and in physical space. All televisions that provide Internet connectivity sold in the United States are now required to provide voice guidance to blind and visually impaired users, as well as closed-captioning for the hearing impaired. The same is true of cell phones, and in a sign of a rising tide lifting all boats, tech giants like Apple, Google and Microsoft have invested incredible amounts of money and manpower into making their products usable right out of the box by people with a wide spectrum of disabilities. Part of this groundswell is doubtless due to a great deal of regulation on the subject, as well as the need to conform to accessibility standards to qualify for things like government procurement contracts and federal grants that fund purchases by educational institutions. It is also true, however, that blind and visually impaired consumers are more numerous than ever and just as eager to partake in the latest tech trends as their fully sighted counterparts.
And while it is far too soon to claim that the goal of universal access is near fruition, it might be instructive to remember just how far we’ve come. At about the time of the ADA’s passing, reading machines equipped with optical character recognition sensors and text-to-speech synthesizers cost upwards of $12,000 when fully equipped. According to a New York Times article, the Holy Grail of assistive technology professionals in the early 1990s was the production of a reading machine priced at just $1,000. This would make it a bargain compared to the first Kurzweil reading machine, which debuted 17 years before: that then miraculous invention was the size of a washing machine and cost $50,000. Now, in 2017, the descendant of the Kurzweil Reader, the KNFB Reader, is a phone app that costs a tenth of that once almost preposterous goal.
At the same time, the technologies powering these types of products were developing with unbelievable speed. While the acceleration of the rate at which computers process information has begun to slow, it grew geometrically for decades. Put differently, just as the computers on the Apollo missions had less brainpower than a graphing calculator, the best computer you could purchase 20 years ago had less power than the second-hand smartphone that many children will inherit over the next few months from their parents.
Those phones, the iPhone 6 for instance, are capable of running the KNFB Reader, a collaboration between Ray Kurzweil and the National Federation of the Blind. This app debuted in 2014 but has its roots in a series of devices stretching back almost a decade further.
Back then, designers saw the potential of two devices gaining widespread acceptance: the personal digital assistant and the digital camera. Combining the two with a slimmed-down version of the OCR software found on computers led to the introduction of the first incarnation of the KNFB Reader. This product was revolutionary for its portability and speed. On the other hand, it was really two devices, a PDA and a point-and-shoot digital camera held together in a plastic frame, and it cost around $2,000. While people were excited by the possibilities of these new technologies, something on the horizon was causing even more of a stir: the advent of the smartphone.
It’s hard to imagine now, but less than a decade ago, not everyone was so closely wedded to a phone. A cell phone was carried around to make calls, perhaps to send text messages, or, if the user was really cutting edge, to serve as a gaming or GPS device. In 2007, though, the KNFB Reader jumped to what is still considered an iconic device for a generation of blind tech enthusiasts, the Nokia N series of smartphones. This phone didn’t have a touchscreen, or a full keyboard for that matter. What it did have, though, was a 5-megapixel sensor, xenon flash and a Zeiss lens for taking high-resolution photos. Truth be told, even this camera’s performance is dwarfed by modern devices, but its then revolutionary fidelity was enough to compete with dedicated digital cameras and made it possible for the KNFB Reader to exist on a device that also functioned as a fully featured phone with text-to-speech capabilities. As a dedicated blindness solution, however, it still cost upward of $2,000.
Fast forward to the summer of 2014, a year of development after the release of the iPhone 5S with its 8-megapixel camera, LED flash and dual-core processor, and you have the right conditions for a transformation in how blind people gain access to technology. Rather than the requirements of the technology dictating which platform would be selected, the mainstream off-the-shelf hardware available to millions, soon to be billions, of people would be sufficient to make print accessible to the masses. The KNFB Reader app now cost just $100. And that brings us to today.
When Microsoft leveraged the raw computing power of its artificial intelligence research and cloud servers as part of a project to provide a tool to help visually impaired people learn about the world around them, the venerable tech company elected to make the result freely available. Seeing AI does more than just turn pictures of documents into navigable text. It also reads shorter text like signs in real time, scans barcodes to retrieve relevant product information, and identifies currency. What’s more, it attempts to describe scenes using artificial intelligence, and identifies people and their facial expressions. The one catch is that this application requires a connection to Microsoft’s servers to function. However, in an age when we are swiftly approaching ubiquitous connectivity for much of the world’s population, this is a modest obstacle when compared to the insurmountable economic realities that used to make these sorts of capabilities the stuff of science fiction.
So that’s where we’ve gone in a little more than 40 years. The mind boggles to think where we’ll go from here. Powered by the spirit of innovation and equal access that made all this possible, hopefully we’ll find ourselves more equal than ever before.