OCR training on Chinese characters

William1225 · ‎03-10-2014

Dear all,

I'd like to recognize Chinese characters with the OCR toolkit.

With English letters and digits, I can train several characters at once, say "abc123".

However, when I train several Chinese characters at once, a warning pops up saying the number of characters is incorrect.

In addition, a single character object can be given a name made up of multiple Chinese characters.

Does that mean I must train Chinese characters one by one, never several at a time?

It seems that Chinese characters are not "effective" characters that the LabVIEW OCR toolkit can process individually, and that it can only process them as a whole.

Could someone give me some suggestions? Thanks a lot.

BruceAmmons · ‎03-11-2014

I believe Chinese characters are represented in the operating system by multiple characters, probably a two-character sequence. Even though it displays as a single character, to LabVIEW it looks like you are typing two characters. It might be worthwhile to verify this by checking the string length in LabVIEW.
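For anyone who wants to see the same effect outside LabVIEW, here is a minimal Python sketch of that length check. It assumes the text is stored in GBK, a common multi-byte encoding on Chinese-locale Windows systems; the helper name `byte_lengths` is made up for this example, and LabVIEW's own string handling may differ.

```python
def byte_lengths(text, encoding="gbk"):
    """Return (character, number of bytes) pairs for each character in text.

    The default encoding "gbk" is an assumption for illustration only.
    """
    return [(ch, len(ch.encode(encoding))) for ch in text]


# ASCII letters and digits take one byte each in GBK.
print(byte_lengths("abc123"))

# Each Chinese character takes two bytes in GBK, so a byte-oriented
# string length reports 4 for these two visible characters.
print(byte_lengths("中文"))
print(len("中文".encode("gbk")))
```

In UTF-8 the same two characters would take three bytes each, so the exact count depends on the encoding in use.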

I think training one character at a time would work best. If you are going to be doing a lot of training, you could write a quick utility that would loop through each unknown character and request the correct character, resulting in the training of one character at a time for a large sequence of characters.
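The shape of such a utility could look roughly like the Python sketch below. It is not OCR toolkit code: `train_one_at_a_time`, the glyph identifiers, and the `ask_label` callback are all hypothetical names standing in for the segmented character objects and the operator prompt.

```python
from typing import Callable, Dict, Hashable, Iterable, List


def train_one_at_a_time(
    glyphs: Iterable[Hashable],
    ask_label: Callable[[Hashable], str],
) -> Dict[str, List[Hashable]]:
    """Ask for a label for each unknown glyph and group glyphs by label.

    glyphs    -- hypothetical identifiers for segmented character objects
    ask_label -- hypothetical callback that returns the correct text for
                 one glyph (an empty string means "skip this glyph")
    """
    trained: Dict[str, List[Hashable]] = {}
    for glyph in glyphs:
        label = ask_label(glyph)  # exactly one glyph per request
        if not label:
            continue  # operator chose not to train this glyph
        trained.setdefault(label, []).append(glyph)
    return trained


# Example: answers come from a lookup table instead of a dialog box.
answers = {"obj0": "中", "obj1": "文", "obj2": "", "obj3": "中"}
result = train_one_at_a_time(answers.keys(), lambda g: answers[g])
print(result)
```

In a real tool the callback would show the glyph image and wait for the operator to type the character.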

Bruce

Bruce Ammons
Ammons Engineering

William1225 · ‎03-11-2014

Hi, Bruce,

Thanks for your reply.

Yes, a Chinese character is 2 bytes, and an English character is 1 byte.

But the OCR toolkit does not consider the actual length of the character; it treats any non-alphanumeric character as a single character.

That is why I think I can label one object with several Chinese characters.

You can see in the second image of the top post that LabVIEW finds one character of interest, and I can train it with three Chinese characters. (Of course, this is not the case for English characters.)

I'll train one character at a time in the future. Thanks again for your suggestion!

BruceAmmons · ‎03-11-2014

That is one feature of the OCR toolkit that I like - you can represent a single symbol with multiple characters. This is only possible when training one character at a time, though. I like that I could recognize a logo and replace it with the text [LOGO] or something like that.

Bruce

Bruce Ammons
Ammons Engineering

William1225 · ‎03-12-2014

I have to correct my question: after updating Vision to version 2013, I can train several Chinese characters at a time.

The problem I ran into was under Vision 2011.

Sorry for any confusion.
