OCR Image Reader: simple, powerful OCR without server interaction
The "OCR Image Reader" extension aims to ease optical character recognition (OCR) in your browser. After installation, the extension adds a new button to the toolbar area of your browser. When this button is pressed, the current window enters area selection mode, where you select an area on the current page. You can exit this mode by pressing the Escape key. The selected area is then sent to the extension's OCR engine, which extracts all of its text content. The progress of the text extraction is displayed in a popup-like floating window on the page. If you have multiple jobs, you get multiple floating windows.

The OCR engine of this extension is Tesseract.js, which supports more than 100 languages and is written purely in JavaScript. Note that on first use, the extension fetches the appropriate language database from the server. On later uses the OCR process takes less time, since your browser has already cached this resource. The progress of both the fetching and the OCR extraction is displayed in the popup window.

FAQs

  1. What is the "OCR - Image Reader" add-on and how can I use it?

    This extension gives you a simple yet powerful in-page OCR application without installing a native one. It is simple to use. Whenever you need to extract the content of an image or of text that cannot be selected, press the toolbar button to switch to area selection mode. Hold the mouse button down, drag over the area of interest, then release the button. At this point, the extension captures the screen area and sends this image to the OCR engine. This process happens inside the page in a frame element. You can follow the progress of the entire process in a popup window. The first step is to fetch the language training database from the server; the second is to extract the text content from the image area. Each step has its own progress bar. The fetch step might take some time on the first run, but should be fast on all subsequent calls because your browser has already cached the resource.
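
As a rough illustration of the area selection step, the sketch below turns the two mouse positions of a drag into a crop rectangle. The function name `selectionToRect` and the `scale` parameter (for example, the device pixel ratio of a captured screenshot) are assumptions made for this example; the extension's actual code may work differently.

```javascript
// Hypothetical helper: converts the mouse-down and mouse-up positions of a
// drag into a rectangle. The drag may go in any direction, so the corners
// are normalized with min/abs before scaling.
function selectionToRect(start, end, scale = 1) {
  const left = Math.min(start.x, end.x);
  const top = Math.min(start.y, end.y);
  const width = Math.abs(end.x - start.x);
  const height = Math.abs(end.y - start.y);
  return {
    left: Math.round(left * scale),
    top: Math.round(top * scale),
    width: Math.round(width * scale),
    height: Math.round(height * scale)
  };
}

// In a browser, the rectangle could then be used to crop a screenshot on a
// canvas, for example:
//   ctx.drawImage(screenshot, r.left, r.top, r.width, r.height,
//                 0, 0, r.width, r.height);
```

Selecting only the required area keeps the image sent to the OCR engine small, which shortens processing time.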

  2. Recommended: the "Spoof Geolocation" extension for Chrome, Edge, and Firefox browsers

    This extension alters the geolocation reported by your browser. You can provide a custom latitude and longitude to any website to improve privacy or to get localized data for a custom location. It is also useful if you use a SOCKS proxy in your browser and want the geolocation data to be consistent with your IP address.

  3. What's new in this version?

    Please check the Logs section.

  4. Does this extension use an online service for doing the text recognition?

    No, the process of extracting the text content from the image happens entirely locally. However, note that this extension gets the training data from a remote server, since the database is about 30 MB and cannot be packed with the extension itself. Apart from fetching this database, the extension does not interact with any remote services.

  5. Can I send a very large image to the OCR engine?

    Theoretically, you can send large images too, but the extension will take a long time to process them. Extracting the content might even require more CPU resources than are available. It is recommended to use the area selection tool to select only the required area, instead of a large image with large empty areas around the text.

  6. What is the OCR engine of this extension?

    This extension uses the powerful Tesseract.js engine, with language training resources fetched online so that the latest database is used.

  7. When I try to use this extension on local images, I get the "Cannot access contents of the page. Extension manifest must request permission to access the respective host" notification. Is there any way to use this extension on local images?

    You need to create a local server and access the images through an "http://127.0.0.1/..." address to be able to use this extension. On modern browsers, no extension can access the FILE scheme, so local files have to be accessed over the standard HTTP scheme, like normal web pages. Read the "How do you set up a local testing server" documentation to set up a local server.
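
A minimal sketch of such a local server in Node.js is shown below. The names `createImageServer` and `resolveInsideRoot`, the `./images` folder, and port 8080 are assumptions made for this example; any static file server works equally well.

```javascript
const http = require('http');
const fs = require('fs');
const path = require('path');

// Content types for a few common image formats.
const MIME = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
  '.webp': 'image/webp'
};

// Maps a request URL to a file path inside rootDir.
// Returns null if the URL is malformed or points outside rootDir.
function resolveInsideRoot(rootDir, requestUrl) {
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(requestUrl, 'http://127.0.0.1').pathname);
  } catch (e) {
    return null;
  }
  const root = path.resolve(rootDir);
  const filePath = path.resolve(root, '.' + pathname);
  if (filePath !== root && !filePath.startsWith(root + path.sep)) {
    return null;
  }
  return filePath;
}

// Creates (but does not start) a server that serves files from rootDir.
function createImageServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveInsideRoot(rootDir, req.url);
    if (!filePath) {
      res.writeHead(403);
      res.end('Forbidden');
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end('Not found');
        return;
      }
      const type = MIME[path.extname(filePath).toLowerCase()] || 'application/octet-stream';
      res.writeHead(200, {'Content-Type': type});
      res.end(data);
    });
  });
}

// Usage (assumed folder and port):
//   createImageServer('./images').listen(8080, '127.0.0.1');
// Then open http://127.0.0.1:8080/picture.png in the browser.
```

Binding to 127.0.0.1 keeps the server reachable only from your own machine.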

  8. I usually perform OCR on documents in different languages. It would be helpful if this extension could detect the language of the image and use the proper engine. Is this possible?

    As of version 0.2.3, it is possible to change the language to auto-detect. If this mode is selected, the extension performs an initial OCR on the image in three different languages (English, Arabic, and Japanese), and uses the output text to detect the content's language with the Compact Language Detector (CLD) algorithm. On successful language detection, the extension uses the detected language to perform the actual OCR. Note that since the extension needs to fetch the training data for multiple languages, the first detection is slow.
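
The flow described above can be sketched as follows. This is only an illustration under stated assumptions: `recognize` and `detectLanguage` are hypothetical stand-ins for the OCR engine and a CLD-style detector, and choosing the most confident probe as the text used for detection is a guess, not the extension's documented behavior.

```javascript
// Languages probed first, as Tesseract language codes
// (English, Arabic, Japanese).
const PROBE_LANGUAGES = ['eng', 'ara', 'jpn'];

// Hypothetical contract:
//   recognize(image, lang) resolves to {text, confidence}
//   detectLanguage(text) returns a language code, or null on failure
async function recognizeWithAutoDetect(image, {recognize, detectLanguage}) {
  // Step 1: run the initial OCR once per probe language.
  const probes = await Promise.all(PROBE_LANGUAGES.map(async lang => {
    const result = await recognize(image, lang);
    return {lang, text: result.text, confidence: result.confidence};
  }));

  // Assumption: the most confident probe gives the best text for detection.
  const best = probes.reduce((a, b) => (b.confidence > a.confidence ? b : a));

  // Step 2: detect the language from the probe text.
  const detected = detectLanguage(best.text);
  if (!detected || detected === best.lang) {
    // Detection failed or confirmed the probe language: reuse the probe.
    return {lang: best.lang, text: best.text};
  }

  // Step 3: run the actual OCR with the detected language.
  const final = await recognize(image, detected);
  return {lang: detected, text: final.text};
}
```

Each language used here needs its own training data, which is why the first auto-detect run is slower than a run with a fixed language.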

  9. Is it possible to post the OCR result to a server?

    As of version 0.2.4, you can define a custom server to post the result. A use case is to copy the data to a local text file without a manual copy and paste process. To configure the server, use Shift + click on the "Post Result" button. General Format:

    [GET|POST|PUT]|URL|[POST|PUT body]
    Post Example:
    POST|http://127.0.0.1:8080|&content;
    POST|http://127.0.0.1:8080|{"body":"&content;"}
    Put Example:
    PUT|http://127.0.0.1:8080|&content;
    Get Example:
    GET|http://127.0.0.1:8080?data=&content;|
    Open in a Browser Tab Example:
    OPEN|http://127.0.0.1:8080?data=&content;|
    The "OPEN" command can be used, for instance, to search for the extracted content in a new browser tab or to send the content to a website that needs user interaction. The &content; keyword in the URL part is replaced with the actual result and is encoded (encodeURIComponent), but the &content; keyword in the body section is replaced without any encoding. You can have one instance of the &content; keyword in the URL and one instance in the body part. You can write the server code in any language, such as Python, PHP, or JavaScript. Here is sample code written in JavaScript for Node.js. Alter the code to fit your needs:

    const http = require('http');

    const server = http.createServer(function(req, res) {
      // Uncomment to allow cross-origin requests:
      // res.setHeader('Access-Control-Allow-Origin', '*');

      console.log('Request URL: ' + req.url);
      console.log('Request method: ' + req.method);

      if (req.method === 'GET') {
        // GET requests carry the result in the URL; there is no body to read
        res.end();
      }
      else {
        // POST and PUT requests carry the result in the request body
        const chunks = [];
        req.on('data', chunk => {
          chunks.push(chunk);
        });
        req.on('end', () => {
          console.log('Body:', Buffer.concat(chunks).toString('utf8'));
          res.end();
        });
      }
    });

    server.listen(8080);

    Supported keywords:

    • &content; OCR result
    • &href; Document URL
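
To make the command format more concrete, here is a sketch of how such a command string could be expanded. The function name `expandCommand` is hypothetical, and the sketch assumes that values are encoded with encodeURIComponent in the URL part and inserted unencoded in the body part. Unlike the extension, which documents one keyword instance per part, this sketch replaces every occurrence.

```javascript
// Hypothetical helper: expands a "METHOD|URL|BODY" command string.
// values = {content: <OCR result>, href: <document URL>}
function expandCommand(command, values) {
  // Split on the first two "|" only, so the body may itself contain "|".
  const first = command.indexOf('|');
  const second = command.indexOf('|', first + 1);
  if (first === -1 || second === -1) {
    throw new Error('Expected the format METHOD|URL|BODY');
  }
  const method = command.slice(0, first).toUpperCase();
  const urlTemplate = command.slice(first + 1, second);
  const bodyTemplate = command.slice(second + 1);

  // Replace the keywords in a single pass; `encode` decides how each
  // value is written into the template.
  const fill = (template, encode) =>
    template.replace(/&(content|href);/g, (match, key) => encode(values[key]));

  return {
    method,
    url: fill(urlTemplate, encodeURIComponent), // encoded in the URL part
    body: fill(bodyTemplate, value => value)    // inserted as-is in the body
  };
}

// Example:
//   expandCommand('GET|http://127.0.0.1:8080?data=&content;|',
//                 {content: 'a b&c', href: 'https://example.com/'})
// gives the URL http://127.0.0.1:8080?data=a%20b%26c and an empty body.
```

Encoding the URL part matters because OCR output often contains spaces, ampersands, and line breaks that would otherwise break the query string.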

  10. [Version 0.2.7] Is it possible to close all result panels at once?

    Hold the Shift key while pressing the "Close" button to close all open panels on the current page.

  11. Can I translate the extracted text from an image using this extension?

    You can use the "Post Result" button to send the extracted text to a translation service. For instance, to send the text to Google Translate and translate from any language into English, use:

    OPEN|https://translate.google.com/?sl=auto&tl=en&text=&content;|
    
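    To send the text to DeepL, use the command below. In this URL the first language code is the source and the second is the target, so this example translates from English into German; change the codes to fit your needs: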
    
    OPEN|https://www.deepl.com/translator#en/de/&content;|

  12. I would like my other extensions, such as a text highlighter or a text-to-speech reader, to access the content this extension extracts. Is it possible?

    The interface of each extension is an isolated area, so other extensions cannot access it. You can use the following "post" command to open the extracted content in a new web page. This web page is accessible to all other extensions.

    OPEN|https://webbrowsertools.com/simple-text-editor/?content=&content;|

  13. This extension caches the training data to increase the detection speed. How can I remove this cached data?

    [As of version 0.3.2] Use Ctrl + Click or Command + Click on the "Close" button of an OCR result box. This action removes all the cached training data. The extension fetches a new copy on the next detection request.

What's new in this version

Change Logs:
    Last 10 commits on GitHub

    Need help?

    If you have questions about the extension, or ideas on how to improve it, please post them on the support site. Don't forget to search through the existing bug reports first, as your question or bug has most likely already been reported or has a workaround posted.

    Permissions explained

    Permission      Description
    storage         to keep the internal preferences
    activeTab       to inject area select script into the active page after a user action
    notifications   to display possible warnings during the OCR process
