Text Recognition Web App In Javascript | letsbug

     Today we are making a text recognition web app with JavaScript. When you open the app, it asks for camera permission. Once you allow it, you see the camera feed (most likely your face) on the screen.

    Now hold up a book or anything else that has text on it. When the text is properly aligned on the screen, click anywhere on the page. The app will scan the frame, display the recognized text on the screen, and also read it aloud for you.
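    The flow described above (capture a frame, recognize the text, display it, speak it) can be sketched as one small function. This is only an illustration of the idea, not the code used in the steps of this post: `scanAndSpeak` and its four callbacks are hypothetical names, passed in so the flow can be tried without a camera or an OCR engine.

```javascript
// Hypothetical helper: every dependency is injected, so nothing here
// touches the camera, tesseract.js, or the speech engine directly.
async function scanAndSpeak({ captureFrame, recognize, show, speak }) {
  const frame = await captureFrame()      // e.g. a canvas holding the current video frame
  const rawText = await recognize(frame)  // e.g. the text returned by the OCR step
  show(rawText)                           // display the text with its original line breaks

  // Collapse line breaks and repeated spaces so the speech sounds continuous
  const spoken = rawText.replace(/\s+/g, ' ').trim()
  if (spoken) speak(spoken)               // skip speaking when nothing was recognized
  return spoken
}
```

    In the real app, `captureFrame` would draw the video onto a canvas and `recognize` would call the tesseract.js worker.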

   To make this happen we need a special library: 'tesseract.js'. Tesseract.js is a JavaScript OCR (optical character recognition) library, and it is the backbone of the app. To keep things a little simpler, we are using Vite to build this JavaScript app.

    Vite is a JavaScript build tool and module bundler like webpack. If you have used webpack, you can think of Vite as a faster alternative to it.

Text Recognition App In Javascript

    Follow these steps to make the app.

Step 1: Create a vite app.

        First you have to create a Vite app. Open a terminal (for example Windows PowerShell) and type the following command. It will create a Vite app for you. You can create a vanilla Vite app or use any framework template.

$ npm create vite <name-of-app>

Step 2: Install tesseract.js

    Now that the app is ready, install the tesseract.js library from npm:

$ npm install tesseract.js

Step 3: Add Javascript

    Now add the following code to the main.js file of your app.

    Code:

import './style.css'

// Render the app markup into the #app container
document.querySelector('#app').innerHTML = `
  <h1>Text Recognition</h1>
  <video width="400" height="300"></video>
  <p>click on window to take picture</p>
  <pre id="result"></pre>
`
// get the video element and the result element
const video = document.querySelector('video')
const result = document.querySelector('#result')

// get the createWorker function from tesseract.js
import {
  createWorker
} from 'tesseract.js'
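// just a helper for setting up (async, because it uses await)
async function setup() {
  // Create the worker; awaiting also covers versions where createWorker() returns a promise
  const worker = await createWorker()
  // Note: the three calls below follow the older tesseract.js worker API.
  // Newer major versions take the language directly, e.g. createWorker('eng'),
  // so check the documentation for the version you installed.
  await worker.load() // load the model
  await worker.loadLanguage('eng') // load the language
  await worker.initialize('eng') // initialize the language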

  //check if the browser has the camera
  try {
    // get the stream from the camera
    const stream = await navigator.mediaDevices.getUserMedia({
      video: true // only video is requested, so no microphone audio is captured
    })
    video.srcObject = stream // set the video element source
    video.play() // play the video

    // take a picture on any click event
    window.addEventListener('click', async () => {
      const canvas = document.createElement('canvas') // create a canvas element
      // set the canvas size to the video size
      canvas.width = video.videoWidth
      canvas.height = video.videoHeight
      const ctx = canvas.getContext('2d') // get the context
      // draw the video frame to the canvas
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height)
      // run OCR on the canvas with the tesseract.js worker and show the recognized text in the result element
      const {
        data: {
          text
        }
      } = await worker.recognize(canvas)
      result.textContent = text

      // Read the text aloud, collapsing line breaks and repeated spaces
      let utterance = new SpeechSynthesisUtterance(text.replace(/\s+/g, ' '))
      // Use the preferred voice if the browser offers it, otherwise keep the default voice
      utterance.voice = speechSynthesis.getVoices().find(voice => voice.name === 'Google UK English(Enhanced)') || null
      // Set the rate (1 is normal speed)
      utterance.rate = 0.7
      // Set the volume (valid range is 0 to 1)
      utterance.volume = 1
      // Queue this utterance
      speechSynthesis.speak(utterance)
    })

  } catch (err) {
    alert('No camera detected')
  }
}
setup() // run the setup function
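    The voice name used above may not exist in every browser, and the list returned by `speechSynthesis.getVoices()` can be empty right after the page loads. Below is a small sketch of a safer way to choose the speech settings. The helpers `pickVoice`, `clamp` and `speechSettings` are hypothetical names, not part of any library. They work on plain objects, so they can be tried outside the browser. The rate and volume limits are the ranges documented for the Web Speech API.

```javascript
// Hypothetical helper: exact name first, then any voice for the language, else null
function pickVoice(voices, preferredName, langPrefix = 'en') {
  return (
    voices.find(v => v.name === preferredName) ||
    voices.find(v => v.lang && v.lang.toLowerCase().startsWith(langPrefix)) ||
    null // null lets the browser use its default voice
  )
}

// Keep a number inside the range [min, max]
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value))
}

// Build plain settings that can be copied onto a SpeechSynthesisUtterance
function speechSettings(voices, { voiceName, rate = 1, volume = 1 } = {}) {
  return {
    voice: pickVoice(voices, voiceName),
    rate: clamp(rate, 0.1, 10), // documented rate range is 0.1 to 10
    volume: clamp(volume, 0, 1), // documented volume range is 0 to 1
  }
}
```

    In the click handler this could be used as `Object.assign(utterance, speechSettings(speechSynthesis.getVoices(), { voiceName: 'Google UK English(Enhanced)', rate: 0.7 }))`.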

Step 4: Run 

    Now run the following command in the terminal to preview your app in the browser.

$ npm run dev
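    When you preview the app, the camera request can fail for reasons other than a missing camera, for example when you deny the permission prompt. The code in Step 3 shows the same 'No camera detected' alert for every failure. Here is a minimal sketch of a friendlier message, assuming the error names that browsers commonly report for `getUserMedia`. The function `describeCameraError` is a hypothetical helper, not part of the original code.

```javascript
// Hypothetical helper: turn a getUserMedia error into a readable message
function describeCameraError(err) {
  switch (err && err.name) {
    case 'NotAllowedError':
      return 'Camera permission was denied.'
    case 'NotFoundError':
      return 'No camera detected.'
    case 'NotReadableError':
      return 'The camera is already in use or could not be started.'
    default:
      // Anything else: fall back to the error text, if there is one
      return 'Could not start the camera: ' + (err && err.message ? err.message : 'unknown error')
  }
}
```

    In the catch block you could then write `alert(describeCameraError(err))` instead of the fixed message.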

That's it. If you want the full code of this project, visit this link here

