aMule Forum

English => Feature requests => Topic started by: pochu on March 06, 2007, 01:59:05 PM

Title: Charset Support
Post by: pochu on March 06, 2007, 01:59:05 PM
Hello everybody!

This is my first message in this community, so be polite with me! :D

I have discussed this with Kry on IRC. What I am asking for is support for all charsets. The problem is the following:
Currently, when a file is downloaded, its name is created in Incoming/ as ISO-8859-1, whatever the system charset is.

This is the Ubuntu bug report:
https://launchpad.net/ubuntu/+source/amule/+bug/40238

Talking with Kry:
<Kry>   static wxCSConv aMuleConv(wxT("iso8859-1"));
<Kry>   tht's on StringFunctions.h

...

<Kry>   ok
<Kry>   change the line you changed in your build
<Kry>   to
<Kry>   static wxCSConv aMuleConv(wxConvLocal);
<Kry>   and try then.

I tried that, and it worked really fine. So this is my suggestion:
Can this be included in the CVS source tree, so it is included in aMule 2.2.0?

If you want me to test it, I will do it :)

Thanks for your work
Pochu
Title: Re: Charset Support
Post by: phoenix on March 07, 2007, 12:34:38 PM
Hi Pochu,

Before I do that, I want you to do a test for me. Using this patch, create a folder whose name is a valid iso-8859-1 string and try to share it with aMule. Please report the result.

Title: Re: Charset Support
Post by: Kry on March 07, 2007, 12:48:46 PM
I didn't say I approve of the change, just FYI. It has to be carefully considered.
Title: Re: Charset Support
Post by: pochu on March 07, 2007, 07:42:01 PM
Kry, I haven't said you approved it ;) but I would like you to consider it :D

Quote from: phoenix
"Hi Pochu,

Before I do that, I want you to do a test for me. Using this patch, create a folder whose name is a valid iso-8859-1 string and try to share it with aMule. Please report the result."

Phoenix: my system is entirely utf-8, so I don't know how to create an iso folder (if that's what you want me to do). Or do you want me to create a folder with a valid iso name (such as "valid" hehe) and test it?

I tested with the normal Incoming/ folder, but I can test with whatever you need. But please, specify it :D

Thanks
Emilio
Title: Re: Charset Support
Post by: phoenix on March 09, 2007, 05:28:01 AM
Hi Emilio,

Do something like this:

$ mkdir $'espa\xf1a'

then try to share this directory.
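
For reference, the command above creates a name containing the single byte 0xF1, which is "ñ" in ISO-8859-1. A UTF-8 system would store the same letter as the two bytes 0xC3 0xB1, so it cannot display the lone 0xF1. A minimal C++ sketch for inspecting the bytes of a name follows; hexDump is a hypothetical helper, not part of aMule. Note that C++ hex escapes are greedy, so the literal has to be split after \xf1, whereas bash's $'...' reads at most two hex digits.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: render each byte of a name as two lowercase hex
// digits, separated by spaces.
std::string hexDump(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out.push_back(' ');
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0x0F]);
    }
    return out;
}

// "espa\xf1" "a" is split on purpose: "\xf1a" would be parsed as one
// (out-of-range) hex escape.
const std::string latin1Name = "espa\xf1" "a";     // what the mkdir above creates
const std::string utf8Name   = "espa\xc3\xb1" "a"; // what a UTF-8 system expects
```

With these definitions, hexDump(latin1Name) gives "65 73 70 61 f1 61" and hexDump(utf8Name) gives "65 73 70 61 c3 b1 61".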
Title: Re: Charset Support
Post by: pochu on March 09, 2007, 10:20:39 AM
Hello.

I've tried that.

First, the folder, in Nautilus (gnome) is displayed as espa?a
I have compiled amule cvs with that change, and with wx2.8.1.

I can't share the folder, because it isn't displayed. Well, it is displayed, but it has no name. Also, I created a subdirectory inside espa?a called "testing" to see whether I could share it, and this is what happens (attachment).

Please tell me if I can do anything else to get this working.

Best regards
Emilio
Title: Re: Charset Support
Post by: phoenix on March 10, 2007, 01:12:06 AM
Hi pochu,

Please try this patch and report.
Title: Re: Charset Support
Post by: pochu on March 10, 2007, 01:36:30 AM
You ROCK!
Title: Re: Charset Support
Post by: phoenix on March 14, 2007, 04:27:36 AM
New version of the patch. One less bug  ;D
Title: Re: Charset Support
Post by: phoenix on March 17, 2007, 05:14:44 AM
Last version of the patch. This one is against current CVS (tomorrow). Has many bug fixes and has been tested for a week.
Please test.
Title: Re: Charset Support
Post by: phoenix on March 19, 2007, 11:44:44 PM
Well, it just does not always work. wxStat is foobarred. I'll have to try a workaround, which is to use the present code as a fallback.  :(
Title: Re: Charset Support
Post by: Kry on March 20, 2007, 12:30:52 AM
TOLD YOU
TOLD YOU
TOLD YOU x many times (cut for readability)
Title: Re: Charset Support
Post by: phoenix on March 20, 2007, 11:32:24 AM
Lol!  ;D

But I found another way, just wait... And this time I will use no fallback. ::)
Title: Re: Charset Support
Post by: phoenix on March 21, 2007, 12:47:31 AM
Pochu, please try this one! Notice that you HAVE to run ./autogen.sh, because this patch changes one Makefile.am and your configure script will be wrong.

Kry, this one is definitive  8) And fixes wxStat behaviour.
Title: Re: Charset Support
Post by: Kry on March 21, 2007, 02:25:43 PM
We'll see about that
Title: Re: Charset Support
Post by: phoenix on March 27, 2007, 12:05:54 PM
pochu,

Today's cvs tarball has the last patch. Please report if your issue is gone.

Cheers!
Title: Re: Charset Support
Post by: pochu on March 28, 2007, 06:44:07 PM
Hi phoenix

I've built today's tarball, and have downloaded three files with special characters. It hasn't worked.
I attach a screenshot of ~/.aMule/Incoming/ . One image is better than a thousand words ;)

I have nothing more to say atm, will try to investigate this further.

Regards
Pochu
Title: Re: Charset Support
Post by: phoenix on April 21, 2007, 09:30:17 PM
Hi Pochu,

This image you submitted is from a system file browser or something like this, right? This is not from aMule.

The situation is the following: the file names in your directory use a character set that is not the same as your system's. Your system uses UTF-8, so it will always read garbage when you have, for instance, ISO-8859-1 file names, which seems to be the case.

Well, we have to find a reasonable policy here. At the present time, aMule works like this:

1) Converting from multibyte to UNICODE:
- Assume that input name is ISO-8859-1 and try to convert it to UNICODE. If this fails, then try to convert from UTF-8 to UNICODE.

2) Converting from UNICODE to multibyte:
- Try to convert UNICODE input to ISO-8859-1. If this fails, convert it to UTF-8.
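
Step 2 above can be sketched in plain C++ roughly as follows. This is only an illustration under stated assumptions: encodeLatin1, encodeUtf8 and encodeFileName are hypothetical names, and the real aMule code goes through wxWidgets converters instead. The input is assumed to hold valid Unicode code points.

```cpp
#include <string>

// Hypothetical helper: store the low 8 bits of v in a char.
static char toChar(unsigned long v) {
    return static_cast<char>(static_cast<unsigned char>(v & 0xFF));
}

// Try ISO-8859-1: succeeds only if every code point is at most U+00FF.
bool encodeLatin1(const std::u32string& text, std::string& out) {
    out.clear();
    for (char32_t cp : text) {
        if (cp > 0xFF) return false;
        out.push_back(toChar(cp));
    }
    return true;
}

// Plain UTF-8 encoder (assumes valid code points, no surrogates).
std::string encodeUtf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out.push_back(toChar(cp));
        } else if (cp < 0x800) {
            out.push_back(toChar(0xC0 | (cp >> 6)));
            out.push_back(toChar(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(toChar(0xE0 | (cp >> 12)));
            out.push_back(toChar(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(toChar(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(toChar(0xF0 | (cp >> 18)));
            out.push_back(toChar(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(toChar(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(toChar(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Step 2 as described: ISO-8859-1 first, UTF-8 only if that fails.
std::string encodeFileName(const std::u32string& name) {
    std::string out;
    if (encodeLatin1(name, out)) return out;
    return encodeUtf8(name);
}
```

Under this policy a name such as "españa" is always written with the single byte 0xF1, because every character fits in ISO-8859-1. A name containing, say, "€" (U+20AC) falls through to UTF-8. That would be consistent with a UTF-8 file manager showing "?" in place of "ñ".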

I have been thinking, and maybe step 1 is wrong. ISO-8859-1 to UNICODE must never fail, while UTF-8 to UNICODE can fail. Maybe the right order is the opposite. Still there could be a situation where the file name is ISO-8859-1 but by chance it was a valid UTF-8 sequence.

What we can experiment with is not using ISO-8859-1 file names and using the system encoding instead.
Title: Re: Charset Support
Post by: phoenix on April 23, 2007, 04:07:10 PM
Tomorrow's CVS code will have a different behaviour for #1 above. Now we first try UTF-8 when converting to UNICODE. If this fails, then we try ISO-8859-1.
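
A minimal sketch of that order (strict UTF-8 first, ISO-8859-1 as the fallback that cannot fail) in plain C++. The names decodeUtf8, decodeLatin1 and decodeFileName are hypothetical, and this is not the code that went into CVS, which uses wxWidgets converters.

```cpp
#include <cstddef>
#include <string>

// Strict UTF-8 decoder: returns false on truncated or malformed sequences,
// overlong forms, surrogates, and values above U+10FFFF.
bool decodeUtf8(const std::string& bytes, std::u32string& out) {
    out.clear();
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;  // number of continuation bytes expected
        char32_t cp;        // code point being assembled
        char32_t min;       // smallest value allowed for this length
        if (lead < 0x80)                { extra = 0; cp = lead;        min = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min = 0x10000; }
        else return false;                  // stray continuation or bad lead byte
        if (extra >= n - i) return false;   // sequence runs past the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}

// ISO-8859-1 maps every byte to the code point of the same value,
// so this conversion cannot fail.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (char ch : bytes) out.push_back(static_cast<unsigned char>(ch));
    return out;
}

// New order: UTF-8 first, ISO-8859-1 only if the bytes are not valid UTF-8.
std::u32string decodeFileName(const std::string& bytes) {
    std::u32string out;
    if (decodeUtf8(bytes, out)) return out;
    return decodeLatin1(bytes);
}
```

Both "espa\xc3\xb1" "a" (UTF-8) and "espa\xf1" "a" (ISO-8859-1) decode to the same name here. The remaining ambiguity is the one mentioned earlier in the thread: an ISO-8859-1 name that happens to be a valid UTF-8 sequence, such as the two bytes 0xC3 0xB1 meant as "Ã±", is read as "ñ".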

Cheers!
Title: Re: Charset Support
Post by: pochu on April 23, 2007, 04:19:13 PM
Hi phoenix

Firstly, sorry for not replying to the previous message, I forgot to do it!

Yes, that screenshot is the gnome file manager, nautilus. As you can see, that's the amule Incoming folder, and those are some files I downloaded to test your changes. But it didn't work.

I'll try tomorrow's tarball, let's see if we can finalize this! (/me testing, you coding ;) )

Best regards
Pochu
Title: Re: Charset Support
Post by: phoenix on April 23, 2007, 04:25:13 PM
Hi Pochu,

It is not a matter of working or not working. Maybe you did not understand. It cannot always work, that is the problem.

We could add an option to use the system encoding instead of ISO-8859-1... I don't know, I would like to hear other people's opinion on that.

Cheers!
Title: Re: Charset Support
Post by: pochu on April 25, 2007, 02:08:01 PM
Hi phoenix:

With today's tarball, it still fails with the "ñ" in the files. In the Incoming folder, that char is displayed as "?".

Maybe that option of using the system encoding would be great :)

I've also seen that the next wx release will have UTF-8 support (http://www.wxwidgets.org/wiki/index.php/Development/UTF8)

Maybe with that this will be easier?
Title: Re: Charset Support
Post by: Kry on April 25, 2007, 02:33:46 PM
What, you mean along with the fact that they broke compilation?
Title: Re: Charset Support
Post by: pochu on April 25, 2007, 02:59:25 PM
Kry: the compilation fails with wx2.9, but not with 2.8.x (I've built now against 2.8.4-rc1).

This also failed with previous amule cvs and wx2.8.x (2.8.3.0, e.g.)
Title: Re: Charset Support
Post by: Kry on April 25, 2007, 03:53:24 PM
UTF8 is afaik only in 2.9
Title: Re: Charset Support
Post by: pochu on April 25, 2007, 03:57:22 PM
In wx yes, this can be solved in amule
Title: Re: Charset Support
Post by: Kry on April 25, 2007, 04:15:41 PM
?
Title: Re: Charset Support
Post by: pochu on April 25, 2007, 04:22:36 PM
I mean the charset issue :)

Phoenix got it solved some time ago (with the first or the second patch) but it shouldn't be the first way to do it, and now it doesn't work.

Sorry if I misunderstood your words!
Title: Re: Charset Support
Post by: phoenix on April 26, 2007, 05:39:30 AM
Pochu,

I have never got it "solved" as you say. If you want to try a system encoding, please, change this line:
In file src/libs/common/ConvAmule.h, line 53:
   ConvAmuleBrokenFileNames aMuleConvBrokenFileNames(wxT("ISO-8859-1"));
Change to:
   ConvAmuleBrokenFileNames aMuleConvBrokenFileNames(wxConvLocal);

That is what Kry said to you in the first place. We might add this to the preferences or even better, start checking the LANG environment variable if this becomes an issue.
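
As a rough illustration of the LANG idea, the charset can be read from the part of the locale name between "." and an optional "@modifier" (for example "es_ES.UTF-8"). The sketch below is an assumption about how such a check could look, not aMule code; codesetFromLocaleName and codesetFromEnvironment are hypothetical names. On POSIX systems LC_ALL and LC_CTYPE take precedence over LANG, so calling nl_langinfo(CODESET) after setlocale() is probably more robust than parsing the variable by hand.

```cpp
#include <cstdlib>
#include <string>

// Extract the codeset from a locale name of the form
// language[_territory][.codeset][@modifier]; empty string if none is given.
std::string codesetFromLocaleName(const std::string& locale) {
    const std::string::size_type dot = locale.find('.');
    if (dot == std::string::npos) return std::string();
    const std::string::size_type at = locale.find('@', dot);
    if (at == std::string::npos) return locale.substr(dot + 1);
    return locale.substr(dot + 1, at - dot - 1);
}

// Read LANG only (simplification: LC_ALL and LC_CTYPE are ignored here).
std::string codesetFromEnvironment() {
    const char* lang = std::getenv("LANG");
    return lang ? codesetFromLocaleName(lang) : std::string();
}
```

An empty result (for instance with LANG=C) would mean that no charset is stated, in which case keeping the current ISO-8859-1 default is one possible choice.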

Cheers!
Title: Re: Charset Support
Post by: wuischke on April 29, 2007, 10:05:25 AM
Maybe a related issue: http://forum.amule.org/index.php?topic=12542.0