OCR Image Reader
Simple, powerful OCR without server interaction
The "OCR Image Reader" extension aims to ease the optical character recognition process in your browser. After installation, the extension adds a new button to the toolbar area of your browser. When this button is pressed, the current window switches to the area selection mode, which lets you select an area on the current page. You can leave this mode by pressing the Escape key. The selected area is then sent to the OCR engine of the extension, and all of its text content is extracted. The progress of the text extraction is displayed in a popup-like floating window on each page. If you have multiple jobs, you will get multiple floating windows.

The OCR engine of this extension is Tesseract.js, which supports more than 100 languages and is written purely in JavaScript. Note that on the first use, the extension fetches the proper language database from the server. Later runs take less time because your browser has already cached this resource. The progress of both the fetching and the OCR extraction is displayed in the popup window.

Features

  1. What is the "OCR - Image Reader" add-on and how can I use it?

    This extension gives you a simple yet powerful in-page OCR application without installing a native one. The extension is pretty simple to use. Whenever you need to extract the content of an image or of text that cannot be selected, simply press the toolbar button to switch to the area selection mode. Now hold your mouse button down, select the area of interest, and then release the button. At this point, the extension captures the screen area and sends this image to the OCR engine. This process happens inside the page in a frame element. You can see the progress of the entire process in a popup window. The first step is to fetch the language training database from the server; the second is to extract the text content from the image area. Both steps have a progress bar. The fetch step might take some time on the first run, but should be fast for all subsequent calls, as your browser should have already cached the resource.
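
    The drag selection described above can be turned into a crop rectangle with a few lines of JavaScript. The sketch below is illustrative only and is not taken from the extension's source; the function name toCropRect and its parameters are hypothetical. It normalizes the mouse-down and mouse-up points (the user may drag in any direction) and scales them by the device pixel ratio, assuming the captured screenshot is measured in device pixels.

```javascript
// Hypothetical helper: convert a drag (mouse-down point to mouse-up point)
// into a crop rectangle expressed in screenshot pixels.
function toCropRect(start, end, devicePixelRatio = 1) {
  // The user may drag right-to-left or bottom-to-top, so take min/abs.
  const left = Math.min(start.x, end.x);
  const top = Math.min(start.y, end.y);
  const width = Math.abs(end.x - start.x);
  const height = Math.abs(end.y - start.y);

  // Assumption: the screenshot is in device pixels, the mouse points in CSS pixels.
  return {
    left: Math.round(left * devicePixelRatio),
    top: Math.round(top * devicePixelRatio),
    width: Math.round(width * devicePixelRatio),
    height: Math.round(height * devicePixelRatio)
  };
}
```

    Keeping the rectangle tight around the text means less image data for the OCR engine to process.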

  2. Recommended: the "Dino - The Dinosaur Game" extension for Chrome, Edge, and Firefox browsers

    Time to have a break? Avoid obstacles, including cacti and pterodactyls, by pressing the up and down keys. That's it! The original game was created by Sebastien Gabriel in 2014.

  3. What's new in this version?

    Please check the Logs section.

  4. Does this extension use an online service for doing the text recognition?

    No, the process of extracting the text content from the image happens entirely locally. However, note that this extension gets the training data from a remote server, since the database is about 30 MB and cannot be packed with the extension itself. Apart from fetching this database, the extension does not interact with any remote service.

  5. Can I send a very large image to the OCR engine?

    Theoretically, you can send large images too, but it is going to take a long time for the extension to process them. Extracting the content might even require more CPU resources than are available. It is recommended to use the area selection tool to select only the required area instead of a large image with a large empty area around the text.

  6. What is the OCR engine of this extension?

    This extension uses the powerful Tesseract.js engine and fetches the language training resources online so that it always has the latest database.
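
    For readers who want to try the engine outside the extension, here is a minimal Node.js sketch. It assumes the tesseract.js npm package (version 2 or later) is installed and that its recognize(image, langs, options) call accepts a logger option that receives messages with status and progress fields. The helper names formatProgress and recognizeImage are hypothetical, and this is not the extension's own code.

```javascript
// Hypothetical helper: turn a Tesseract.js logger message into a progress line.
// Messages are assumed to look like {status: 'recognizing text', progress: 0.5}.
function formatProgress(message) {
  const percent = Math.round((message.progress || 0) * 100);
  return `${message.status}: ${percent}%`;
}

// Sketch only: runs OCR on an image (file path, URL, or buffer) and
// resolves to the recognized text.
async function recognizeImage(image, lang = 'eng') {
  // Loaded lazily so formatProgress can be used without the package installed.
  const Tesseract = require('tesseract.js');
  const result = await Tesseract.recognize(image, lang, {
    // Called for both the language-data download and the recognition step.
    logger: message => console.log(formatProgress(message))
  });
  return result.data.text;
}
```

    A call such as recognizeImage('./scan.png').then(console.log) would print progress lines followed by the text, where ./scan.png stands for any image of your own.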

  7. When I am trying to use this extension on local images, I get the "Cannot access contents of the page. Extension manifest must request permission to access the respective host" notification. Is there any way to use this extension on local images?

    You need to create a local server and open the images through an "http://127.0.0.1/..." address to be able to use this extension. On modern browsers, extensions cannot access the file:// scheme, so local files have to be served over the standard HTTP scheme, like normal web pages. Read the "How do you set up a local testing server?" documentation to set up a local server.
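
    If Node.js is already installed, a small static file server is enough for this purpose. The following is a minimal sketch using only Node.js built-in modules, not an official recommendation; the names createImageServer and resolveRequestPath, the ./images folder, and port 8080 are all assumptions chosen for illustration.

```javascript
const http = require('http');
const fs = require('fs');
const path = require('path');

const MIME_TYPES = {
  '.png': 'image/png', '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg',
  '.gif': 'image/gif', '.webp': 'image/webp', '.html': 'text/html'
};

// Map a request URL to a file inside rootDir.
// Returns null when the URL is malformed or points outside rootDir.
function resolveRequestPath(rootDir, requestUrl) {
  try {
    const pathname = decodeURIComponent(new URL(requestUrl, 'http://127.0.0.1').pathname);
    const root = path.resolve(rootDir);
    const filePath = path.resolve(root, '.' + pathname);
    return filePath === root || filePath.startsWith(root + path.sep) ? filePath : null;
  }
  catch (e) {
    return null;
  }
}

// Create (but do not start) a server that serves files from rootDir.
function createImageServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveRequestPath(rootDir, req.url);
    if (filePath === null) {
      res.writeHead(403);
      res.end('Forbidden');
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end('Not found');
        return;
      }
      const type = MIME_TYPES[path.extname(filePath).toLowerCase()] || 'application/octet-stream';
      res.writeHead(200, {'Content-Type': type});
      res.end(data);
    });
  });
}

// Example start-up (left commented out so the sketch has no side effects):
// createImageServer('./images').listen(8080, '127.0.0.1');
```

    With the last line enabled, an image saved as ./images/scan.png would be reachable at http://127.0.0.1:8080/scan.png, where the extension can capture it like any other web page.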

  8. I usually perform OCR on documents in different languages. It would be helpful if this extension could detect the language of the image and use the proper engine. Is this possible?

    As of version 0.2.3, it is possible to change the language to auto-detect. If this mode is selected, the extension performs an initial OCR pass on the image in three different languages (English, Arabic, and Japanese), and uses the output text to detect the content's language using the Compact Language Detector (CLD) algorithm. On successful language detection, the extension uses the detected language to perform the actual OCR. Note that since the extension needs to fetch the training data for multiple languages, the first detection is slow.
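
    The two-pass flow described above can be sketched as follows. This is a simplified illustration, not the extension's implementation: recognize and detectLanguage are hypothetical callbacks supplied by the caller (for example, wrappers around Tesseract.js and a CLD binding). Joining the probe outputs before detection, having the detector return a Tesseract language code, and falling back to English when detection fails are all assumptions.

```javascript
// Tesseract language codes for the three probe languages
// (English, Arabic, Japanese).
const PROBE_LANGUAGES = ['eng', 'ara', 'jpn'];

// recognize(image, lang) -> Promise<string>
// detectLanguage(text)   -> Tesseract language code, or null when unsure
// Both callbacks are hypothetical and must be provided by the caller.
async function ocrWithAutoDetect(image, recognize, detectLanguage, fallback = 'eng') {
  // First pass: run OCR once per probe language.
  const probeTexts = [];
  for (const lang of PROBE_LANGUAGES) {
    probeTexts.push(await recognize(image, lang));
  }

  // Assumption: the probe outputs are simply joined before detection.
  const detected = detectLanguage(probeTexts.join('\n')) || fallback;

  // Second pass: the actual OCR in the detected language.
  return {lang: detected, text: await recognize(image, detected)};
}
```

    The sketch makes the cost visible: every auto-detected job runs four OCR passes, and the first run also has to download the data for each language involved.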

  9. Is it possible to post the OCR result to a server?

    As of version 0.2.4, you can define a custom server to post the result. A use case is to copy the data to a local text file without a manual copy and paste process. To configure the server, use Shift + click on the "Post Result" button. General Format:

    [GET|POST|PUT]|URL|[POST|PUT body]
    Post Example:
    POST|http://127.0.0.1:8080|&content;
    POST|http://127.0.0.1:8080|{"body":"&content;"}
    Put Example:
    PUT|http://127.0.0.1:8080|&content;
    Get Example:
    GET|http://127.0.0.1:8080?data=&content;|
    The &content; keyword in the URL part is replaced with the actual result after it has been encoded with encodeURIComponent, whereas the &content; keyword in the body is replaced with the result as-is, without encoding. You can have one instance of the &content; keyword in the URL and one instance in the body. You can write the server code in any language, such as Python, PHP, or JavaScript. Below is a sample written in JavaScript for Node.js. Alter the code to fit your needs:
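    const http = require('http');

    const server = http.createServer((req, res) => {
      // Uncomment to allow cross-origin requests:
      // res.setHeader('Access-Control-Allow-Origin', '*');

      console.log('Request URL: ' + req.url);
      console.log('Request method: ' + req.method);

      if (req.method === 'GET') {
        // For GET rules the OCR result arrives in the URL (logged above)
        res.end();
      }
      else {
        // For POST and PUT rules the OCR result arrives in the request body
        const chunks = [];
        req.on('data', chunk => chunks.push(chunk));
        req.on('end', () => {
          console.log('Body:', Buffer.concat(chunks).toString('utf8'));
          res.end();
        });
      }
    });

    server.listen(8080);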
    Supported keywords:

    • &content; OCR result
    • &href; Document URL
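
    To make the rule format concrete, the sketch below shows one way a METHOD|URL|BODY rule could be expanded into a request. It illustrates the documented behavior and is not the extension's actual code: the function name buildRequest is hypothetical, and encoding &href; in the URL the same way as &content; is an assumption.

```javascript
// Hypothetical helper: expand a "METHOD|URL|BODY" rule into request parts.
function buildRequest(rule, content, href) {
  const first = rule.indexOf('|');
  if (first === -1) {
    throw new Error('Expected METHOD|URL|BODY');
  }
  // Split on the first two pipes only, so the body may itself contain "|".
  const second = rule.indexOf('|', first + 1);
  const method = rule.slice(0, first).trim().toUpperCase();
  const urlTemplate = second === -1 ? rule.slice(first + 1) : rule.slice(first + 1, second);
  const bodyTemplate = second === -1 ? '' : rule.slice(second + 1);

  const values = {content, href};
  // Single pass, so keywords that occur inside the OCR text are not expanded again.
  const expand = (template, encode) =>
    template.replace(/&(content|href);/g, (match, key) => encode(values[key]));

  return {
    method,
    // Values placed in the URL are percent-encoded.
    url: expand(urlTemplate, encodeURIComponent),
    // Values placed in the body are inserted unchanged; GET carries no body.
    body: method === 'GET' ? undefined : expand(bodyTemplate, value => value)
  };
}
```

    Because the body is inserted unchanged, a JSON body such as {"body":"&content;"} is only valid JSON when the OCR result contains no quotes or line breaks. Posting the raw text and parsing it on the server side avoids that problem.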

What's new in this version

Change Logs:
    Last 10 commits on GitHub

    Need help?

    If you have questions about the extension, or ideas on how to improve it, please post them on the support site. Don't forget to search through the bug reports first, as most likely your question or bug has already been reported or a workaround has been posted for it.


    Permissions are explained

    Permission       Description
    storage          to keep the internal preferences
    activeTab        to inject area select script into the active page after a user action
    notifications    to display possible warnings during the OCR process
