EyeRead: a design for a low-vision aid powered by computer vision
Jacob Lee: VP of Finding Issues
Cole Purcell: Product designer, document manager
Ryan Flores: Writer
Andy Roland: Nitpicker
Over the last decade, people’s daily activities have increasingly moved from the physical world to the digital. As a consequence of this online transition, information is often presented visually, through text, pictures, and videos. While this adds convenience to many sighted people’s lives, we also need to consider the consequences for people who aren’t able to access this visual information. As our team researched the experiences of people with low vision, we found that many expressed frustration at the lack of screen reader compatibility in many phone and desktop apps, and at the general lack of accessibility features in other screen-based interfaces such as self-checkout kiosks. It became clear to our team that our design needed to offer a more general solution that could adapt to different situations, especially those where a screen reader is unavailable.
Our solution takes the form of a pair of glasses that includes a camera, a hand tracking module, a microphone, and bone conduction earphones, and would connect to a user’s phone via Bluetooth. While wearing the glasses, users can point toward an article of text and initiate reading through either a voice command or a programmable hand gesture. The glasses would use optical character recognition to process the full text. Users can then skip to important parts of the text, and even search the text given a query. Our design also uses computer vision techniques to navigate screen-based user interfaces. When a user uses the glasses to read an interface, our design would capture both the text and each element of the interface. The user could ask for the location of elements such as a search bar or a menu icon, and the glasses would then guide the user’s finger to the right place. With these features, the glasses could act as a substitute for a screen reader wherever accessibility features aren’t offered.
Paper Prototype, Testing Process, and Results:
Initially, our design was a handheld device similar to a barcode scanner, which had a wider variety of features. However, there were several issues with this design. The first was that the form factor made many daily activities more difficult, since the device had to be held constantly during use. We also found that testers struggled with the large number of buttons and features. To address both the form factor and the complexity of the initial design, we replaced the barcode scanner with a pair of glasses that has only a single multi-function button.
The pair of glasses could be operated hands-free and had a more focused scope of features. The glasses initially had only a camera, a microphone, a speaker, and an onboard computer. Testing revealed that it was difficult to orient the camera by looking around, so we added a hand tracking module that could be used to guide the camera’s attention. To address the increased weight of the device, we removed the onboard computer and instead offloaded processing to the user’s phone over a Bluetooth connection. We also replaced the speaker with bone conduction headphones so that audio feedback wouldn’t block out the sounds of the user’s surroundings. With the physical design for the glasses in hand, we were then able to further refine the functionality of the device and create a digital mockup of the app that would be used to operate and configure the glasses.
As the glasses house most of the functionality of our design, the companion app is mostly for settings and customization and will be optimized for screen reader use. The home page has a button to read text, an icon showing whether Bluetooth is connected, and a settings button.
Inside Settings, there are several options to select. Help gives a general overview of how to use the app along with some frequently asked questions. In Sound, you can change the volume, the reading speed, and what happens when new audio feedback arrives while other audio is already playing.
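To make the audio-overlap setting concrete, here is a minimal sketch of how such a policy could behave. The policy names (`INTERRUPT`, `QUEUE`, `DISCARD`) and the function are purely hypothetical illustrations, not part of the actual design:

```python
from enum import Enum

class OverlapPolicy(Enum):
    """Hypothetical options for what happens when new audio feedback
    arrives while something is already playing."""
    INTERRUPT = "interrupt"   # stop current audio, play the new message
    QUEUE = "queue"           # finish current audio, then play the new one
    DISCARD = "discard"       # ignore the new message entirely

def handle_feedback(playing, policy, queue, new_msg):
    """Apply the policy; return the new (now playing, queue) pair."""
    if playing is None:
        return new_msg, queue        # nothing playing: just play it
    if policy is OverlapPolicy.INTERRUPT:
        return new_msg, queue        # replace the current audio
    if policy is OverlapPolicy.QUEUE:
        return playing, queue + [new_msg]   # play it later
    return playing, queue            # DISCARD: drop the new message
```

A user who wants urgent guidance to cut in immediately would pick `INTERRUPT`, while one who prefers uninterrupted reading would pick `QUEUE`.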
The next two options are voice controls and motion controls, which customize the commands for reading text and navigating UIs. Currently only three reading commands are shown, but tasks such as navigating to the search bar or to the beginning of the page can be added under either control type, and users will not be limited to three commands.
To read text on a page, the user would point at an article of text. The user could then press the read button on the app, use a motion gesture that they defined, or say “begin reading”. The user would then be prompted to decide whether to begin at the start of the article of text, the start of the block of text being pointed at, or exactly where the user is pointing. Once the glasses begin reading text, the user can skip through sentences or paragraphs, or return to previous points in the text.
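The reading flow above can be sketched as a cursor over OCR output. This is a hypothetical illustration: it assumes the OCR stage delivers text already split into paragraphs and sentences, and the class and method names are invented for the sketch:

```python
class TextReader:
    """Minimal sketch of the reading cursor described above (names assumed)."""

    def __init__(self, paragraphs):
        # paragraphs: list of paragraphs, each a list of sentence strings
        # (assumed to come pre-split from the OCR stage).
        self.paragraphs = paragraphs
        self.p = 0   # current paragraph index
        self.s = 0   # current sentence index within the paragraph

    def start(self, mode, pointed_paragraph=0):
        """'article' starts at the top; 'block' starts at the pointed paragraph."""
        self.p = 0 if mode == "article" else pointed_paragraph
        self.s = 0

    def current(self):
        return self.paragraphs[self.p][self.s]

    def next_sentence(self):
        """Skip forward one sentence, rolling into the next paragraph."""
        if self.s + 1 < len(self.paragraphs[self.p]):
            self.s += 1
        elif self.p + 1 < len(self.paragraphs):
            self.p, self.s = self.p + 1, 0
        return self.current()

    def next_paragraph(self):
        """Skip forward one whole paragraph."""
        if self.p + 1 < len(self.paragraphs):
            self.p, self.s = self.p + 1, 0
        return self.current()

    def previous_sentence(self):
        """Return to the previous point in the text."""
        if self.s > 0:
            self.s -= 1
        elif self.p > 0:
            self.p -= 1
            self.s = len(self.paragraphs[self.p]) - 1
        return self.current()
```

For example, starting in "block" mode at the paragraph the user is pointing at simply places the cursor at that paragraph's first sentence, while "article" mode resets it to the very beginning.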
Navigating a UI is a similar process. A user can point at an interface, such as a restaurant’s online menu on a tablet, or an app on their desktop PC, and say “read screen” or make the equivalent gesture. The user can then move their pointer finger around and the app will read out any buttons, menus, or other interactive elements, and the user will have access to the same text-reading capabilities as with reading physical text. The glasses will also allow the user to search for a UI element given a prompt such as “where can I find the dropdown menu”, “where is the search bar”, or “how do I return to the home page”, and the glasses will verbally guide the user’s pointer finger to the requested element. Instead of tracking the user’s hand, the glasses could also track a mouse cursor on a screen.
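The element-search-and-guidance step could be sketched as follows. The sketch assumes the vision stage has already produced labeled UI elements with bounding boxes, and that camera-frame coordinates grow rightward in x and downward in y; both the data format and function names are illustrative assumptions:

```python
def find_element(query, elements):
    """Return the first detected element whose label appears in the query.
    elements: list of {"label": str, "box": (left, top, right, bottom)}."""
    q = query.lower()
    for el in elements:
        if el["label"] in q:
            return el
    return None

def guidance(finger, box):
    """Turn a finger position and a target bounding box into a spoken hint.
    Assumes camera-frame coordinates: x grows right, y grows down."""
    fx, fy = finger
    left, top, right, bottom = box
    if left <= fx <= right and top <= fy <= bottom:
        return "your finger is on it"
    hints = []
    if fx < left:
        hints.append("move right")
    elif fx > right:
        hints.append("move left")
    if fy < top:
        hints.append("move down")
    elif fy > bottom:
        hints.append("move up")
    return " and ".join(hints)

elements = [{"label": "search bar", "box": (100, 100, 200, 120)}]
target = find_element("where is the search bar", elements)
print(guidance((10, 10), target["box"]))  # move right and move down
```

Re-running `guidance` as the hand tracker updates the finger position yields the kind of step-by-step verbal guidance described above; the same logic would apply unchanged to a tracked mouse cursor.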
Since the preliminary mockup, we have changed some of our button titles and drafted a small amount of content for the help section.
From our user research, we found that screen readers often lack adequate compatibility with certain websites or apps, and that there is currently no way to read text on paper or similar surfaces that is both reliable enough to depend on in dire circumstances and versatile enough to use in a wide variety of situations. EyeRead fills that gap by giving users a means to read written or printed text as well as on-screen text and UI elements. By using both computer vision techniques and machine learning, EyeRead aims to remain reliable whenever it reads text for its user, whether on a screen or a sheet of paper. In addition, EyeRead’s flexibility in user input gives users the versatility they need to live more independent lives.