amuleweb in aMuleCVS-20071226 does not handle Unicode characters properly in search/download. There are actually two problems:
1. URL decoding does not work properly under certain local charsets (e.g. CP936). This is caused by the lines
char toReplace[2] = {(char)i, 0}; // decode URL
m_dec_str[i] = char2unicode(toReplace);
For some charsets (e.g. CP936), a byte value i greater than 127 is treated as the lead byte of a two-byte character, but {i, 0} is not a valid two-byte sequence, so char2unicode() fails (see the sketch after this list).
2. The key/value pairs submitted to the aMule webserver are parsed as an 8-bit charset, not a Unicode encoding (e.g. UTF-8). This is caused by CUrlDecodeTable::DecodeString(), which treats every escaped/unescaped character in the value as an 8-bit character, without any further decoding.
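A minimal standalone sketch of problem 1, assuming char2unicode() ultimately goes through the locale converter (written here directly as wxConvLocal), with the behaviour of the patch below shown next to it for comparison:

    // sketch.cpp -- illustration only, not part of aMule; link against wxBase
    #include <wx/string.h>
    #include <wx/strconv.h>
    #include <wx/init.h>
    #include <cstdio>

    int main()
    {
        wxInitializer init;                  // initialise wxBase without a wxApp

        int i = 0xB0;                        // any byte > 127 is a CP936 lead byte
        char toReplace[2] = {(char)i, 0};

        // Current behaviour: locale-dependent conversion. Under CP936 the lone
        // lead byte is an incomplete two-byte sequence, so the conversion
        // fails and the result is empty.
        wxString viaLocale(toReplace, wxConvLocal);

        // Patched behaviour: store the byte value as a single character,
        // independent of the local charset.
        wxString viaFormat = wxString::Format(wxT("%c"), i);

        printf("via locale: %u chars, via format: %u chars\n",
               (unsigned)viaLocale.Len(), (unsigned)viaFormat.Len());
        return 0;
    }

Under CP936 the first string should come out empty, while the second is always exactly one character long, whatever the local charset.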
If we can assume that all amuleweb skins use UTF-8 as their encoding, the following patch fixes both problems.
--- amule-cvs.org/src/webserver/src/WebServer.cpp 2007-12-08 02:04:25.000000000 +0800
+++ amule-cvs/src/webserver/src/WebServer.cpp 2007-12-27 11:05:46.669544744 +0800
@@ -145,8 +145,9 @@
snprintf(fromReplace, sizeof(fromReplace), "%%%02X", i);
m_enc_u_str[i] = char2unicode(fromReplace);
- char toReplace[2] = {(char)i, 0}; // decode URL
- m_dec_str[i] = char2unicode(toReplace);
+ //char toReplace[2] = {(char)i, 0}; // decode URL
+ //m_dec_str[i] = char2unicode(toReplace);
+ m_dec_str[i] = wxString::Format(wxT("%c"), i);
}
}
@@ -157,6 +158,12 @@
str.Replace(m_enc_l_str[i], m_dec_str[i]);
str.Replace(m_enc_u_str[i], m_dec_str[i]);
}
+ char *buffer = new char[str.Len() + 1];
+ for (size_t i=0;i<str.Len();++i)
+ buffer[i] = str[i];
+ buffer[str.Len()] = 0; // Mark the end of the string
+ str = UTF82unicode(buffer);
+ delete[] buffer;
}
CUrlDecodeTable* CUrlDecodeTable::ms_instance;
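For reference, the decode order that DecodeString() follows after the patch can be summarised as a standalone sketch (DecodeUrlUtf8 is a hypothetical helper, not aMule code; the final step is what UTF82unicode() is used for in the patch):

    // decode_sketch.cpp -- illustration only, not aMule code
    #include <wx/string.h>
    #include <wx/strconv.h>
    #include <string>
    #include <cctype>
    #include <cstdlib>

    static wxString DecodeUrlUtf8(const std::string& url)
    {
        std::string bytes;
        for (size_t i = 0; i < url.size(); ++i) {
            if (url[i] == '%' && i + 2 < url.size()
                    && isxdigit((unsigned char)url[i + 1])
                    && isxdigit((unsigned char)url[i + 2])) {
                // Step 1: each %XX escape becomes one raw byte,
                // with no charset interpretation yet.
                char hex[3] = { url[i + 1], url[i + 2], 0 };
                bytes += (char)strtol(hex, NULL, 16);
                i += 2;
            } else if (url[i] == '+') {
                bytes += ' ';
            } else {
                bytes += url[i];
            }
        }
        // Step 2: only now decode the whole byte sequence as UTF-8.
        return wxString(bytes.c_str(), wxConvUTF8);
    }

For example, "%E4%B8%AD%E6%96%87" first becomes the six bytes E4 B8 AD E6 96 87 and then two CJK characters (U+4E2D U+6587); decoding each escaped byte through the local charset one at a time, as before the patch, garbles such input.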
The patch has been tested on my Linux box (Fedora Core 3, x86_64, CP936 locale).