Handle Text File Operations - problem with Unicode files



NewsArchive
08-11-2014, 02:46 AM
Hi Friedrich,

This issue has just cost me a couple of wasted hours:

I got a text file from my customer and the Count Lines function didn't
work. A debug message showed that the function returned only 1 line. I
then noticed the cause: the file is Unicode-encoded. When I saved the
text file in Notepad as ANSI, Count Lines and the other Text File
Operations functions worked fine.

So my questions are:

1. How can I detect whether a text file is Unicode-encoded?

2. If it is Unicode, how can I save the text file as ANSI so that the
SetupBuilder Text File Operations functions work?

This is a big issue for us and I hope there is a solution. I am using the
latest SB.

--
Best regards,
Jeffrey

NewsArchive
08-11-2014, 02:46 AM
Hi Jeffrey,

UNICODE is not supported. The solution is to write your own DLL to detect
it as a UNICODE file and then convert the text file from UNICODE to ASCII.
Just call the DLL function from within your installer.

Friedrich

NewsArchive
08-11-2014, 02:46 AM
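BTW, you can try to "detect" Unicode files by looking for a "BOM" (EF BB BF
for UTF-8, FF FE for UTF-16 little-endian, which is what Notepad writes when
you save as "Unicode"). Most Windows programs that write Unicode files emit
BOMs (but don't rely on it).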
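
A minimal sketch of the BOM check, written in Python purely for
illustration (it is not SetupBuilder code, and the name detect_bom is made up
for this example). It only recognises files that start with a BOM; a Unicode
file without one returns None, which is why the check should not be relied on
alone.

```python
import codecs

# Known byte-order marks. UTF-32 LE is listed before UTF-16 LE because its
# BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

Reading only the first four bytes of the file is enough for this test, for
example detect_bom(open(path, "rb").read(4)).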

And there is a Windows API:

http://msdn.microsoft.com/en-us/library/dd318672.aspx

Friedrich

NewsArchive
08-11-2014, 02:47 AM
Hi Friedrich,

That's a real pity.
I have just seen that your PUTINI function works fine with Unicode
files, but the Text File Operations functions don't. My point is that I
want to handle text files as ANSI so that I can use your Text File
functions. But sometimes my customers send me Unicode text files,
which leads to errors when I (or they) run my SB script.

Best regards
Jeffrey

NewsArchive
08-11-2014, 02:47 AM
Jeffrey,

No, that is not correct. The PUTINI function is a wrapper around the
WritePrivateProfileStringA (ANSI) Windows API and not
WritePrivateProfileStringW (Unicode).

And you can't process a Unicode file from within an (ASCII) Text File
Operation. BTW, just for fun, try to read your Unicode file from your own
ASCII application and display its contents. Then you'll see what I mean ;-)
ANSI and UNICODE are completely different animals.
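
To make the "different animals" point concrete, here is a small Python
illustration. It assumes a hypothetical byte-oriented line counter that
counts CR LF pairs; how SetupBuilder's Count Lines works internally is not
stated in this thread, so this is only one plausible explanation for the
"1 line" result. In UTF-16 every character takes two bytes, so CR and LF are
separated by a zero byte and the pair 0D 0A never appears.

```python
import codecs

text = "first\r\nsecond\r\nthird"

# The same three lines, stored as ANSI and as UTF-16 LE with a BOM.
ansi_bytes = text.encode("cp1252")
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def count_lines_bytewise(data: bytes) -> int:
    """Hypothetical ANSI-style counter: number of CR LF pairs plus one."""
    return data.count(b"\r\n") + 1


print(count_lines_bytewise(ansi_bytes))   # 3
print(count_lines_bytewise(utf16_bytes))  # 1
```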

BTW, this is not a SB issue at all! To solve your issue just write your own
small DLL that can detect and convert UNICODE text files into ASCII. The
IsTextUnicode() Windows API is what Notepad uses to differentiate between
Unicode and ANSI/UTF8. Among other tests, it looks for the BOM header in a
text file.
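
The detect-and-convert step could look roughly like the sketch below. It is
Python rather than a DLL, so treat it as an outline of the logic only; the
names to_ansi and convert_file are invented for this example, code page 1252
is an assumption, and it relies on a BOM instead of calling IsTextUnicode().
Characters that do not exist in the target code page are replaced with "?".

```python
import codecs


def to_ansi(data: bytes, codepage: str = "cp1252") -> bytes:
    """Convert BOM-marked UTF-8 or UTF-16 bytes to an ANSI code page.

    Data without a recognised BOM is returned unchanged (assumed to be
    ANSI already).
    """
    if data.startswith(codecs.BOM_UTF8):
        text = data.decode("utf-8-sig")   # this codec strips the UTF-8 BOM
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = data.decode("utf-16")      # this codec reads and strips the BOM
    else:
        return data
    # Unencodable characters become "?" instead of raising an error.
    return text.encode(codepage, errors="replace")


def convert_file(src_path: str, dst_path: str, codepage: str = "cp1252") -> None:
    """Read src_path as raw bytes and write the ANSI version to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_ansi(data, codepage))
```

Line endings pass through untouched, so a CR LF file stays CR LF after the
conversion.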

Friedrich

NewsArchive
08-11-2014, 02:47 AM
Perhaps you can use the following command line tool:

TxtU2A.zip -- TxtU2A is a Windows console program that quickly converts text
files from UNICODE to ASCII.

http://www.softspecialists.com/download.aspx

If you are allowed to redistribute this tool, you can call it from your
installer once you have detected a UNICODE text file.

Friedrich

NewsArchive
08-11-2014, 02:48 AM
Hi Friedrich,

Thanks!
This code in a batch file also does the trick:

CHCP 1252
TYPE unicode.txt > ansi.txt

See:
http://www.robvanderwoude.com/type.php#Unicode

Can I use batch files with SB?

Best regards
Jeffrey

NewsArchive
08-11-2014, 02:48 AM
Hi Jeffrey,

> Can I use batch files with SB?

Yes, batch files are absolutely no problem.

Friedrich

NewsArchive
08-11-2014, 02:48 AM
A checkbox that said "Auto-detect unicode and convert to ansi prior to
reading" would be nifty. :)

>BTW, this is not a SB issue at all!

Jeff Slarve
www.jssoftware.com
www.twitter.com/jslarve
I'll search help files & Google for you.

NewsArchive
08-11-2014, 02:49 AM
+1000 ;-)
That would save me a lot of problems.

Best regards,
Jeffrey

NewsArchive
08-22-2014, 01:42 AM
Jeffrey,

> +1000 ;-)
> That would save me a lot of problems.

Would it be possible to send one or two of your Unicode files to support
[at] lindersoft [dot] com?

Thanks,
Friedrich