Automated Classification of Images and Diagrams for Accessible Digital Textbooks

May 8, 2012 - 23:31 -- Gerardo Capiel

Students who are blind or have other visual impairments need textbooks and other books that are accessible via text-to-speech and refreshable braille.  For all content in textbooks and books to be accessible to students, educational concepts conveyed by images and diagrams must be described.  Descriptions are typically done by subject matter experts (SMEs).  Today most textbooks and other educational books lack these descriptions.  

Given that complex images and diagrams are best described by SMEs, we have been identifying ways to streamline the process of grouping, categorizing and prioritizing images, so that SMEs can use our Poet tool to "batch" describe images or leverage prior descriptions that are contextually relevant.  We would like to challenge developers to develop algorithms based on existing open source technologies, such as OpenCV and Exemplar-SVM, to group similar images (e.g., all pie charts, or all images used only for formatting) and to categorize images (e.g., math expressions in images, bar graphs, simple shapes, photos, faces).  The technical skills required are Python, C or C++ to work with OpenCV; MATLAB and C++ to work with Exemplar-SVM; and Ruby on Rails to work with Poet.
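
As one hedged illustration of the "group similar images" idea, the sketch below uses a simple average hash: each image is reduced to an 8x8 grid of brightness cells, each cell becomes one bit, and images whose bit patterns differ in only a few positions are grouped together. This is a stand-in technique chosen for brevity, not OpenCV's or Exemplar-SVM's method, and it mainly catches near-duplicates such as repeated formatting images. The function names, the grid size and the distance threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# A grayscale image as rows of 0-255 integer values (assumed input format).
Gray = Sequence[Sequence[int]]


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Average each of size x size blocks, then set one bit per block above the mean."""
    height, width = len(pixels), len(pixels[0])
    cells: List[float] = []
    for by in range(size):
        y0 = by * height // size
        y1 = max((by + 1) * height // size, y0 + 1)  # at least one row per block
        for bx in range(size):
            x0 = bx * width // size
            x1 = max((bx + 1) * width // size, x0 + 1)  # at least one column
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def group_similar(hashes: Dict[str, int], max_distance: int = 5) -> List[List[str]]:
    """Greedy grouping: join the first group whose first member is close enough."""
    groups: List[List[str]] = []
    for name, value in hashes.items():
        for group in groups:
            if hamming(hashes[group[0]], value) <= max_distance:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

In practice the pixel grids could come from OpenCV, for example by reading each file with `cv2.imread` and the `cv2.IMREAD_GRAYSCALE` flag and converting the result to nested lists with `.tolist()`. Telling a pie chart from a bar graph would need richer features than this hash provides.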

Other ideas for streamlining the process of describing images via Poet are welcome.

Benetech, the author of this problem statement, will be providing expert advice during the event to ensure your work has the most impact. We encourage everyone working on this problem to click the Get Involved button on the upper right of this Problem Definition page and join this Google Group to let the team know which location you will be attending and what you would like to work on so event organizers and Benetech can coordinate collaboration.


  • Example of an illustration of the hydrologic cycle that needs to be described and a sample description
  • Collection of Open Educational Resources that are freely downloadable from Bookshare that need image descriptions.
    • To quickly view all the images in a book in this collection:
      1. Click on the title of a book in the OER collection and download the DAISY file or request a download of this example Algebra II book.
      2. Upload the downloaded zip package to Poet (sign up for an account - it's free)
      3. Save the book ID in your notepad
      4. After 15-20 minutes, go to describe your uploaded book.
      5. You will see all the images in the book on the left hand pane.  Note if you use the example Algebra II book, you will find a mix of diagrams and formulas in images, which may need descriptions or MathML markup.
  • Guidelines from the National Center for Accessible Media (NCAM) for image descriptions
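
For developers who want to inspect a downloaded book before uploading it to Poet, the following minimal sketch lists the image files inside the zip package mentioned in the steps above. It relies only on Python's standard `zipfile` module. The function name `list_images` and the set of file extensions are assumptions for illustration; the actual layout of a given DAISY package may differ.

```python
import zipfile
from pathlib import PurePosixPath
from typing import List

# Assumed set of image extensions; adjust to match the books being processed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".svg"}


def list_images(package) -> List[str]:
    """Return the sorted names of image entries in a zip package.

    `package` may be a file path or a file-like object, as accepted by
    zipfile.ZipFile.
    """
    with zipfile.ZipFile(package) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if PurePosixPath(name).suffix.lower() in IMAGE_EXTENSIONS
        )
```

Calling `list_images` with the path to the downloaded zip file would return the image entries that a grouping or classification step could then process.
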
User Stories: 
  • As a volunteer I want to quickly identify all images and diagrams that are related to my subject matter expertise, which need description.

  • As a volunteer I need tools, such as MathTrax (example), to help me quickly and systematically describe images or input MathML.

  • As a volunteer, I want to sort all images by type so that I can describe all the similar ones at once. 

  • As a visually impaired student, I need to be able to access image descriptions via text-to-speech or other auditory or tactile mechanisms, such as braille descriptions, tactile graphics or printable 3D shapes.

  • As a special needs educator, I need to be able to quickly find images (ideally under an open license, such as Creative Commons) that have alternative representations, such as descriptions or tactile graphics.

  • As an author or publisher, I want to find images similar to mine that have already been described so I can see other examples of how such an image is described.

  • Solutions must be open source, specifically under a non-copyleft, OSI-approved license, such as the Revised BSD License (also known as the BSD 3-Clause License).

  • Ideally solutions integrate with the open source Poet tool (overview, GitHub)

  • Solutions should integrate with existing standards for accessibility, such as the DIAGRAM Content Model, DAISY, EPUB 3, BRF, SVG and MathML.

Similar Projects and Resources: 

We encourage developers wishing to tackle this solution to leverage:

Next Steps and Sustainability: 

Benetech, a 501(c)(3) nonprofit software organization, is the author of this problem definition.  Benetech is a partner in managing the DIAGRAM Center and operates Bookshare, the largest library of accessible educational titles.  Currently Bookshare serves nearly 200K students across more than 40 countries and more than 10K schools in the U.S.  Thus Benetech is well positioned to pilot, productize, deploy and market promising solutions.  Promising solutions may also be integrated back into the Poet application or may become future projects for the DIAGRAM Center.

Qualitative Impact: 
The Individuals with Disabilities Education Act of 2004 (IDEA) calls for timely access to educational materials. Through projects such as Benetech’s Bookshare for Education, access to text has greatly increased. Yet, educational materials include a wide array of other types of content. The burden of accessible image preparation typically falls on educators, who have limited time and tools to create useful descriptions or accessible graphics for students. Too often, students using text-based accessible instructional materials (AIM) are presented with only the words "image" or "graphic" when the devices they use to read digital text encounter illustrations, equations, graphics, photos or diagrams in textbooks.
Quantitative Impact: 
According to World Health Organization estimates, approximately 1.4 million children ages 0-14 worldwide are blind. The U.S. Department of Education states that over 25K students are receiving special education under the Individuals with Disabilities Education Act (IDEA) due to vision impairments.