
Sorting a Huge Unicode File using strcmp(unsigned char *, unsigned char *)
: Hi Everyone,
: I have downloaded a huge text file (in Chinese) (21 MB).
: I normally use msort to sort big files. It works perfectly
: for English files but not for Chinese ones.
: This may be due to the strcmp function it uses! strcmp
: takes two (char *) arguments, which are signed. What I want is a strcmp
: that takes two (unsigned char *) arguments.
: Now my question is: where can I download a sorting program
: that works with Unicode and can sort EXTREMELY HUGE files? Or where
: can I download the source code so that I can change it to use unsigned
: char?
As someone else pointed out, there is a function (I don't remember the
name) for comparing Unicode characters. I have a module that sorts huge
amounts of data using your own comparison function; look for "bigsort.c" in
http://www.pci.uni-heidelberg.de/tc/usr/joerg/prg/testbigsort.tar.gz
It should be fairly easy to adapt this code into the solution you need.
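
For illustration, here is a minimal sketch of the kind of comparison
function you could plug into such a sort. It is not taken from bigsort.c;
the names cmp_unsigned_bytes, cmp_lines and sort_lines are made up for this
example, and it assumes the file is in a byte-oriented encoding with no
embedded zero bytes (UTF-8, for instance, where unsigned byte order matches
code point order). It would not work as-is on UTF-16 text. Note that a
conforming ISO C strcmp already compares bytes as unsigned char, so this
mainly makes the intent explicit for libraries that do not. For ordering
that follows the locale's collation rules, the standard strcoll or wcscoll
may be closer to what you want.

```c
#include <stdlib.h>

/* Compare two NUL-terminated strings byte by byte, treating every
   byte as unsigned char so that values >= 0x80 sort after ASCII.
   (Illustrative helper, not part of bigsort.c or msort.) */
int cmp_unsigned_bytes(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    /* Advance while both strings agree and neither has ended. */
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }
    /* Yields -1, 0 or 1 without any risk of integer overflow. */
    return (*p > *q) - (*p < *q);
}

/* qsort() callback: each array element is a char * to one line. */
int cmp_lines(const void *x, const void *y)
{
    const char *a = *(const char *const *)x;
    const char *b = *(const char *const *)y;

    return cmp_unsigned_bytes(a, b);
}

/* Hypothetical helper: sort an in-memory array of n lines. */
void sort_lines(char **lines, size_t n)
{
    qsort(lines, n, sizeof lines[0], cmp_lines);
}
```

Calling sort_lines(lines, n) on an array of line pointers sorts them in
place. For a file too large to hold in memory, the same cmp_lines callback
could in principle be handed to whatever external sort you end up using,
provided it accepts a qsort-style comparison function.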
Hope that helps, Joerg
: Please Help!!!
: Yick Yan
--
\|/
------------------------------------------------oOO-(_)-OOo---------
Joerg Schoen
E-mail: Joerg.Schoen AT tc DOT pci DOT uni-heidelberg DOT de
Web-Page: http://www.pci.uni-heidelberg.de/tc/usr/joerg
--------------------------------------------------ooO-Ooo-----------