Screen optical character recognition

Time:01-04

I'm trying to parse mostly numbers and some letters from the screen. There are plenty of topics online, but 99% of them are dated, and all of them ask how to get text from an image (which I don't need). In my case, I simply want a transparent PictureBox that scans whatever text is under it. I found a fairly old YouTube tutorial, and the code given in its description is:


Imports Emgu.CV
Imports Emgu.Util
Imports Emgu.CV.OCR
Imports Emgu.CV.Structure

Public Class Form1

    Dim OCRz As Tesseract = New Tesseract("tessdata", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_ONLY)
    Dim pic As Bitmap = New Bitmap(270, 100)
    Dim gfx As Graphics = Graphics.FromImage(pic)

    Private Sub Timer1_Tick(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Timer1.Tick

        'If Windows XP
        'The 4 and 30 pixel offsets presumably account for the window border and title bar
        gfx.CopyFromScreen(New Point(Me.Location.X + PictureBox1.Location.X + 4, Me.Location.Y + PictureBox1.Location.Y + 30), New Point(0, 0), pic.Size)
        PictureBox1.Image = pic

        'If Windows 7
        'gfx.CopyFromScreen(MousePosition, New Point(0, 0), pic.Size)

    End Sub

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        'Run OCR on the last captured bitmap and show the recognized text
        OCRz.Recognize(New Image(Of Bgr, Byte)(pic))
        RichTextBox1.Text = OCRz.GetText

    End Sub
End Class

This code is 8 years old, so I think there must be something better. I downloaded Emgu.CV from NuGet (the first and most downloaded package), but at runtime I get the error "Unable to create ocr model using Path 'tessdata' and language 'eng'." At compile time I get the error "'Tesseract' is not defined." That's really frustrating, because I've been looking everywhere, including C# forums, and found nothing that helps. Do you have a solution? I would appreciate your help. Thanks.

CodePudding user response:

That code is quite old, and I doubt it would work properly. Have you considered Windows.Media.Ocr? It requires the Windows 10 SDK.

Controls used:

  • 1 transparent PictureBox (set its BackColor to the same color as the form's TransparencyKey; I'm using grey)
  • 1 RichTextBox
  • 1 Button

Add these as references:

"C:\Program Files (x86)\Windows Kits\10\UnionMetadata\Windows.winmd"

"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETCore\v4.5\System.Runtime.WindowsRuntime.dll"

Imports Windows.Media.Ocr
Imports System.IO
Imports System.Runtime.InteropServices.WindowsRuntime

Public Class Form1
    Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim softwareBmp As Windows.Graphics.Imaging.SoftwareBitmap

        'Capture the screen area under the PictureBox and convert it to a SoftwareBitmap
        Using bmp As Bitmap = New Bitmap(PictureBox1.Width, PictureBox1.Height)
            Using g As Graphics = Graphics.FromImage(bmp)
                'Translate the PictureBox position (form client coordinates) to screen coordinates
                Dim pt As Point = Me.PointToScreen(New Point(PictureBox1.Left, PictureBox1.Top))
                g.CopyFromScreen(pt.X, pt.Y, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy)
                Using memStream = New Windows.Storage.Streams.InMemoryRandomAccessStream()
                    bmp.Save(memStream.AsStream(), System.Drawing.Imaging.ImageFormat.Bmp)
                    Dim decoder As Windows.Graphics.Imaging.BitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(memStream)
                    softwareBmp = Await decoder.GetSoftwareBitmapAsync()
                End Using
            End Using
        End Using

        Dim ocrEng = OcrEngine.TryCreateFromUserProfileLanguages()
        'To recognize one specific language instead, create the engine from that language
        'Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))

        'Both TryCreate methods return Nothing when no OCR language pack matches
        If ocrEng Is Nothing Then
            RichTextBox1.Text = "No OCR engine is available for the requested language(s)."
            Return
        End If

        'Diagnostics: list the installed OCR languages (a Shared property of OcrEngine)
        Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = OcrEngine.AvailableRecognizerLanguages
        For Each language In languages
            Console.WriteLine(language.LanguageTag)
        Next
        'Diagnostics only: inspect these in the debugger if needed
        Dim r = ocrEng.RecognizerLanguage
        Dim n = OcrEngine.MaxImageDimension

        Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBmp)
        RichTextBox1.Text = ocrResult.Text

        'The following lines only test how the OCR engine has split the whole text into lines
        Dim lines As IReadOnlyList(Of OcrLine) = ocrResult.Lines
        For Each line In lines
            Console.WriteLine(line.Text)
        Next
    End Sub
End Class