I'm using the Windows.Media.Ocr engine to scan two lines of text.
But the software scans them column by column,
whereas I expect it to scan them row by row, like this:
KIBA/USDT 0.00003826 6.31M KIBA 241.68459400 USDT
KIBA/USDT 0.00003470 17.13M KIBA 594.48387000 USDT
The code I'm using is:
' Requires references to "C:\Program Files (x86)\Windows Kits\10\UnionMetadata\Windows.winmd",
' "C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETCore\v4.5\System.Runtime.WindowsRuntime.dll"
' and the Windows 10 SDK
Imports Windows.Media.Ocr
Imports System.IO
Imports System.Runtime.InteropServices.WindowsRuntime
Public Class Form1
    Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim softwareBmp As Windows.Graphics.Imaging.SoftwareBitmap
        Using bmp As Bitmap = New Bitmap(PictureBox1.Width, PictureBox1.Height)
            Using g As Graphics = Graphics.FromImage(bmp)
                Dim pt As Point = Me.PointToScreen(New Point(PictureBox1.Left, PictureBox1.Top))
                g.CopyFromScreen(pt.X, pt.Y, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy)
                Using memStream = New Windows.Storage.Streams.InMemoryRandomAccessStream()
                    bmp.Save(memStream.AsStream(), System.Drawing.Imaging.ImageFormat.Bmp)
                    Dim decoder As Windows.Graphics.Imaging.BitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(memStream)
                    softwareBmp = Await decoder.GetSoftwareBitmapAsync()
                End Using
            End Using
        End Using
        Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))
        Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = ocrEng.AvailableRecognizerLanguages
        For Each language In languages
            Console.WriteLine(language.LanguageTag)
        Next
        Dim r = ocrEng.RecognizerLanguage
        Dim n = ocrEng.MaxImageDimension
        Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBmp)
        RichTextBox1.Text = ocrResult.Text
    End Sub
End Class
What change does this code need in order to scan by row rather than by column?
I didn't post it before because I will only need to scan from 0.000038 etc. to 0.0000% anyway.
CodePudding user response:
I chose to act on the output string instead of tackling the OCR API. Fixing the issue within the OCR API would probably be a superior solution if possible, but I could not get your code properly referenced in my system.
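For readers who can reference the OCR API, one possible alternative is to rebuild the rows from word positions instead of from the flat text. As far as the documented API goes, OcrResult.Lines exposes OcrLine objects, each with a Words list whose items carry Text and a BoundingRect. The sketch below is untested against the real engine and makes its assumptions explicit: WordBox is a hypothetical stand-in for one OCR word (you would fill it from OcrWord.Text and OcrWord.BoundingRect), GroupIntoRows is a hypothetical helper name, and the "half a word height" tolerance is an arbitrary choice that may need tuning.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Module OcrRowGrouping

    ' Hypothetical stand-in for one OCR word: its text plus the left edge,
    ' top edge and height of its bounding box.
    Public Structure WordBox
        Public Text As String
        Public X As Double
        Public Y As Double
        Public Height As Double

        Public Sub New(wordText As String, left As Double, top As Double, boxHeight As Double)
            Text = wordText
            X = left
            Y = top
            Height = boxHeight
        End Sub
    End Structure

    ' Puts words whose vertical centres are within half a word height of a
    ' row's first word into that row, then orders each row left to right.
    Public Function GroupIntoRows(words As IEnumerable(Of WordBox)) As List(Of String)
        Dim rows As New List(Of List(Of WordBox))()
        Dim rowCentres As New List(Of Double)()

        For Each w As WordBox In words.OrderBy(Function(b As WordBox) b.Y)
            Dim centre As Double = w.Y + w.Height / 2
            Dim index As Integer = -1
            For i As Integer = 0 To rows.Count - 1
                If Math.Abs(rowCentres(i) - centre) <= w.Height / 2 Then
                    index = i
                    Exit For
                End If
            Next
            If index < 0 Then
                ' No existing row is close enough, so start a new one
                rows.Add(New List(Of WordBox)())
                rowCentres.Add(centre)
                index = rows.Count - 1
            End If
            rows(index).Add(w)
        Next

        Return rows.Select(Function(r As List(Of WordBox)) String.Join(" ", r.OrderBy(Function(b As WordBox) b.X).Select(Function(b As WordBox) b.Text))).ToList()
    End Function

End Module
```

Because the rows are rebuilt from coordinates, this approach would not need the number of columns to be known in advance. If referencing the API is not an option, the string-based approach that follows works on the OCR text alone.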
So you can add this function to transpose the string:
Private Function transpose(input As String) As String
    Dim numberOfColumns = 4 ' this must be known and could be a parameter to this function
    ' Join multi-word cells with a pipe so they survive the split on spaces
    Dim fixedInput = input.Replace(" KIBA", "|KIBA").Replace(" USDT", "|USDT")
    Dim splitInput = fixedInput.Split(" "c)
    Dim numberOfWords = splitInput.Count()
    Dim numberOfRows = numberOfWords \ numberOfColumns ' integer division
    Dim words As New List(Of String)()
    ' The OCR text is column-major: cell (row, col) sits at index row + numberOfRows * col
    For row = 0 To numberOfRows - 1
        For col = 0 To numberOfColumns - 1
            words.Add(splitInput(row + numberOfRows * col))
        Next
    Next
    Dim sb As New System.Text.StringBuilder()
    For i = 0 To words.Count() - 1
        sb.Append(words(i).Replace("|", " ")) ' restore the spaces inside each cell
        If (i <> words.Count() - 1) Then
            ' New line after the last cell of a row, tab between cells otherwise
            sb.Append(If((i + 1) Mod numberOfColumns = 0, Environment.NewLine, vbTab))
        End If
    Next
    Return sb.ToString()
End Function
Simply pass your OCR output string through it. Here it is called in your code:
Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim softwareBmp As Windows.Graphics.Imaging.SoftwareBitmap
    Using bmp As Bitmap = New Bitmap(PictureBox1.Width, PictureBox1.Height)
        Using g As Graphics = Graphics.FromImage(bmp)
            Dim pt As Point = Me.PointToScreen(New Point(PictureBox1.Left, PictureBox1.Top))
            g.CopyFromScreen(pt.X, pt.Y, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy)
            Using memStream = New Windows.Storage.Streams.InMemoryRandomAccessStream()
                bmp.Save(memStream.AsStream(), System.Drawing.Imaging.ImageFormat.Bmp)
                Dim decoder As Windows.Graphics.Imaging.BitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(memStream)
                softwareBmp = Await decoder.GetSoftwareBitmapAsync()
            End Using
        End Using
    End Using
    Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))
    Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = ocrEng.AvailableRecognizerLanguages
    For Each language In languages
        Console.WriteLine(language.LanguageTag)
    Next
    Dim r = ocrEng.RecognizerLanguage
    Dim n = ocrEng.MaxImageDimension
    Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBmp)
    RichTextBox1.Text = transpose(ocrResult.Text)
End Sub
I tested it with this handler:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim input = "0.00003599 0.00003599 104.1K KIBA 23.22M KIBA 3.74655900 USDT 835.89654200 USDT 0.0000% 0.0000%"
    Dim output = transpose(input)
End Sub
Input:
0.00003599 0.00003599 104.1K KIBA 23.22M KIBA 3.74655900 USDT 835.89654200 USDT 0.0000% 0.0000%
Output:
0.00003599 104.1K KIBA 3.74655900 USDT 0.0000%
0.00003599 23.22M KIBA 835.89654200 USDT 0.0000%
Note that any cell containing more than one word must be protected before the split: temporarily replace the space inside it with a pipe (|) so the cell is not broken apart, then restore the space when building the output. If you encounter more cells like this, chain further Replace calls in the same way. If the pipe can legitimately appear in your text, use some other character that never will. The relevant lines are:
Dim fixedInput = input.Replace(" KIBA", "|KIBA").Replace(" USDT", "|USDT")
...
sb.Append(words(i).Replace("|", " "))
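
The two adjustable points mentioned above (the column count and the list of multi-word cells) could be lifted into parameters. The following is only a sketch of that idea, not a replacement tested against real OCR output: TransposeColumns, columnCount, multiWordSuffixes and placeholder are hypothetical names introduced here, and the check on the word count is an added assumption about how mismatched input should be handled.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports Microsoft.VisualBasic

Public Module OcrTranspose

    ' Reorders column-by-column OCR text into tab-separated rows.
    ' multiWordSuffixes lists the trailing words (such as "KIBA") that belong
    ' to the value before them and must not be split off.
    Public Function TransposeColumns(input As String,
                                     columnCount As Integer,
                                     multiWordSuffixes As IEnumerable(Of String),
                                     Optional placeholder As Char = "|"c) As String
        ' Glue each suffix to the preceding value with the placeholder
        Dim fixedInput As String = input
        For Each suffix As String In multiWordSuffixes
            fixedInput = fixedInput.Replace(" " & suffix, placeholder & suffix)
        Next

        Dim cells As String() = fixedInput.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
        If columnCount <= 0 OrElse cells.Length Mod columnCount <> 0 Then
            Throw New ArgumentException("Word count is not a multiple of the column count.")
        End If

        Dim rowCount As Integer = cells.Length \ columnCount
        Dim sb As New StringBuilder()
        For row As Integer = 0 To rowCount - 1
            Dim rowCells As New List(Of String)()
            For col As Integer = 0 To columnCount - 1
                ' Column-major input: cell (row, col) sits at row + rowCount * col
                rowCells.Add(cells(row + rowCount * col).Replace(placeholder, " "c))
            Next
            If row > 0 Then sb.Append(Environment.NewLine)
            sb.Append(String.Join(vbTab, rowCells))
        Next
        Return sb.ToString()
    End Function

End Module
```

With the test string from above, a call such as TransposeColumns(input, 4, {"KIBA", "USDT"}) should give the same two rows as transpose. Like the original, it glues every " KIBA" or " USDT" to the preceding word, so a pair column such as KIBA/USDT in the scanned area would need separate handling.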



