OCR Training, Asian characters and UTF-16

cfoe · ‎09-12-2011

Hello everybody,

I've got a problem using the NI OCR training software delivered with the NI development module.

I am trying to detect Korean characters, but it fails.

After investigation, I found what's going on.

Everything works fine with French or English characters; the problem appears only with Korean characters, and maybe with other Asian ones.

Korean characters are not correctly detected after training.

Let me explain with an example:

Korean characters can be encoded in Unicode using UTF-16.

For example, the character 스 gives the result ¤Â in the OCR Training software, which is wrong.

In Unicode, 스 is U+C2A4, encoded on 16 bits using UTF-16.

The OCR Training software decodes it as U+00C2, corresponding to Â, and U+00A4, corresponding to ¤. In other words, the two bytes of the 16-bit value are each treated as a separate 8-bit character.
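
The byte-level behavior described here can be reproduced outside the tool. The following is a minimal Python sketch, not a description of how the NI software works internally. It assumes the character is stored as little-endian UTF-16 and that each byte is then displayed as one character of a Latin-1 style single-byte encoding; both points are assumptions chosen because they reproduce the ¤Â result.

```python
# Sketch: how U+C2A4 can turn into two Latin characters when its UTF-16
# bytes are read one byte at a time.
# Assumptions (not confirmed for the NI tool): little-endian UTF-16 storage
# and a Latin-1 style single-byte display.

korean = "\uC2A4"  # the Hangul syllable at U+C2A4

utf16_le = korean.encode("utf-16-le")  # low byte first: A4 C2
utf8 = korean.encode("utf-8")          # for comparison: three bytes in UTF-8

# Reading each UTF-16 byte as its own character reproduces the wrong result.
misread = utf16_le.decode("latin-1")

# The bytes are intact, so decoding them as UTF-16 recovers the character.
recovered = misread.encode("latin-1").decode("utf-16-le")

print(utf16_le.hex())                          # a4c2
print(utf8.hex())                              # ec8aa4
print([f"U+{ord(c):04X}" for c in misread])    # ['U+00A4', 'U+00C2']
print(recovered == korean)                     # True
```

Under these assumptions the low byte A4 comes first, which matches the order ¤ then Â shown in the result. The same character takes three bytes in UTF-8, so the two-character result looks more like UTF-16 bytes read through a single-byte code page than like UTF-8.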

How can I set up the OCR Training software to encode the character set file in UTF-16, and to display the result in the string indicator as UTF-16 instead of as 8-bit characters?

ChristopheC · ‎09-12-2011

Hi,

We are working on better support for multibyte characters in the OCR Training Interface. The main issue is that the tool does not allow training multiple multibyte characters at once (or a mix of multibyte and single-byte characters).

The workaround with the shipping product is to train each multibyte character individually, which is time-consuming.

If you're interested in a beta version that solves this issue for multibyte characters, please send your request to vbai.support@gmail.com.

Best regards,

Christophe

cfoe · ‎10-17-2011

Hi Christophe,

Thanks for your response.

I sent two requests to vbai.support@gmail.com, but I have not received any response.

I'm really interested in a beta release that solves the multibyte character issue.

Let me know if it's possible to get this beta version.

Best Regards,

Cfoe
