Wikipedia for mDict - May 2009
Here are the updated Wikipedia editions for mDict.
Languages:
- English
- German
- Spanish
- Portuguese
- Russian
- Chinese
- Japanese
- Polish
- Italian
- Danish (direct download below)
Changes:
- Title added to all pages.
- Spanish version added.
- Chinese version added.
- English "Jumbo" edition added, with 2.3 million full articles.
Since the Danish Wikipedia is so small, here is a direct download link for it.
If you cannot download via BitTorrent, here is a mini version of the English Wikipedia (155,000 articles, very abbreviated).
If you are interested in helping with further updates, please contact me.
22 Comments:
I would like to see a Swedish version. Is that possible, and what do I have to do? Step-by-step instructions would be great!
By Niklas, at 8:50 pm
I've created a copy of the Greek Wikipedia (dated 12-Jul-09). It has all 46202 articles and it's only 62MB.
http://rapidshare.com/files/256984374/el-wiki-full-20090712.rar
Thanks for your software, my friend!!!
By NiTroGen, at 5:49 pm
Swedish wikipedia dumped by me.
~300 000 articles. 204MB
http://downloads.hemkoll.nu/sv-wikipedia-20090712.mdx
Converting 20090725 right now.
I see many "skipped" articles. Why? I have set --minlinks 0
By Niklas, at 5:23 pm
Swedish dump 2009-07-25: http://download.hemkoll.nu/sv-wikipedia-20090725.mdx 209MB
By Niklas, at 12:41 am
@Niklas: Glad to hear it worked out! "Articles" that have a colon in their name are automatically skipped, as they do not contain anything useful. That is stuff like images, categories and similar autogenerated pages.
By Klaus Post, at 7:17 pm
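As an illustration of the skip rule described in the reply above, here is a minimal Python sketch. It is not the converter's actual code; the function names `is_skipped` and `keep_articles` are made up for this example, and the only rule it applies is the one stated in the comment: a title containing a colon is dropped.

```python
def is_skipped(title: str) -> bool:
    """Return True for titles containing a colon, e.g. 'Category:Physics'.

    Assumption: this mirrors only the rule stated in the comment thread,
    not the real converter's logic.
    """
    return ":" in title


def keep_articles(titles):
    """Keep only titles without a colon, preserving the input order."""
    return [t for t in titles if not is_skipped(t)]


sample = ["Denmark", "Category:Physics", "File:Example.jpg", "Copenhagen"]
print(keep_articles(sample))  # prints ['Denmark', 'Copenhagen']
```

Note that a rule this simple would also drop ordinary articles whose titles happen to contain a colon, which may explain some of the "skipped" entries in a conversion log.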
Thanks Klaus. Could you please post the license that Wikipedia uses, in HTML format? I see that you use that in the dictionary information. I'm a bit confused. It seems like Wikipedia uses two different licenses now. I know that they are changing the license, but it seems like the old and the new one are used at the same time now? Best regards!
By Niklas (niklas@hemkoll.nu), at 1:17 pm
Thanks very much for this! I grabbed the small English one, the Japanese and the Chinese versions. However, the jumbo one, I only seem to be able to get 58% and it's been stuck there for a week now. Any suggestions?
By Ben, at 2:22 pm
@Ben: You might want to try again, possibly using another client. There are usually about 10 seeds up - myself included, which should make it possible for you to download it all.
@Niklas: Use the PC version of the mDict client, open a dictionary and RightClick+View Source.
By Klaus Post, at 2:30 pm
I have tried the "jumbo" version on my PDA (XDA Zest) using MDict 3.0 and I get the error: "Open dictionary failed". Has anyone had this working using the PPC WM2003+ version of MDict? I'm pretty sure the dictionary file is intact and readable. Perhaps the index is too large for a 128MB device?
By Adam, at 9:28 pm
Hi Klaus.
The Japanese link in "Wikipedia for mDict - May 2009" seems to be dead. Could you fix it? It would be very kind of you to send the new link to me by mail: sieucrazy@gmail.com. I need it for my studies. Thanks!
By Unknown, at 5:39 pm
@Tron: I just rechecked the link, and it works fine. I'm also seeding it myself, so it shouldn't be a problem to download it. Are you sure your Torrent program is working correctly?
By Klaus Post, at 6:33 pm
Klaus,
Sorry for my bad English... I'm Brazilian. I have the Portuguese version from May 2009.
When will you make a new version of Wikipedia in MDX?
Thanks,
Hugo
hugo.ferreira@ibest.com.br
By Anonymous, at 12:31 pm
I expect to create a new version when a 2010 dump has been created. Wikipedia is still working on the dump they started on the 1st of December, so the next version should be ready within a month or two.
By Klaus Post, at 12:36 pm
Looking forward to a new version !
Any chance of a version with illustrations? I wouldn't mind using an SD card just for Wikipedia if it came to it personally.
By Anonymous, at 6:29 am
Hi Klaus. Any news about the new 2010 version of wikipedia - mdx?
Grateful
Hugo
By Anonymous, at 4:32 am
Great work Sh0dan!
I also would like to participate in this work..
By the way, how did you manage to create an MDX file larger than 2 GB? (I have seen this in your legaltorrent Wikipedia English Jumbo, 2.7 GB.)
If I'm not mistaken, you said in your other blog post that the source size for MDX builder is limited to 2GB.
Thanks,
By Dre, at 6:53 am
Hello Klaus,
How are you?
I wonder if you will still make the new version of Wikipedia (MDX) in Portuguese. We in the Brazilian community eagerly await the release of this new version. We rely heavily on the offline Wikipedia to study in our schools, because we have no connection to the Internet and the libraries have few books.
I check your blog weekly, hoping to find a link to download the 2010 version of Wikipedia MDX in Portuguese.
For further information, my e-mail is: hugof@hospitalalianca.com.br.
Sorry for my bad English.
Grateful,
Hugo Carneiro
By Anonymous, at 3:23 pm
Hi ShOdan,
Really appreciate your work. Thanks.
Have a Windows PPC (Xperia X1) where it all seems to work fine. Don't intend to use Mdict (ver 3.1) on this phone (various reasons) and was only for testing/troubleshooting purposes.
Actually intend to use it on a Smartphone with Winmo6.5 (Samsung Omnia Pro B7320). Am using MDict 3.2 (WinMo 6.5 compliant) and pretty much like Adam, I get the following message:
"Open dictionary failed: \Storage Card \en-wiki-jumbo.mdx, Fail to read file"
Both memory cards are 4GB SD HC cards with FAT32 file system formatting.
Now since it is the same file copied to both cards, there should not be a problem with the file itself.
Please advise on how to resolve the issue. (on the Smartphone)
Thanks
(in case anyone else wants to suggest what to do pls mail at got2log@ gmail )
By Unknown, at 11:54 pm
Thanks for sharing the source code. An update based on your work, using the October 2010 dumps, is available at: http://ahuv.net/wikipedia
Links to torrents:
English: 3,483,000 items - Here
Spanish: 1,361,000 items - Here
French: 1,239,000 items - Here
German: 1,033,000 items - Here
Russian: 720,000 items - Here
Portuguese: 681,000 items - Here
Hebrew: 111,000 items
Arabic: 183,000 items
Persian: 109,000 items
By Or, at 10:40 am
Hello,
Thanks. I want to ask before downloading:
1- Are the articles in the medium en-wiki complete in length?
2- Also, in the medium en-wiki, on what basis were the articles selected? I hope it is not random!
Please tell me.
By muhajer, at 8:31 pm
muhajer:
1) No - the complete version would be too large. It would not fit within the 4GB single-file size limit of your storage.
2) Articles are mainly selected by a Google-esque algorithm: they are ranked by the number of articles linking TO the page, and a minimum article size also applies.
By Klaus Post, at 7:28 am
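To make the selection idea in the reply above more concrete, here is a rough Python sketch of ranking by inbound links with a minimum size. It is a guess at the general approach, not the converter's real code: the names `count_inbound_links`, `select_articles`, `min_links`, `min_size` and `limit` are invented for this example, and link parsing is reduced to a simple `[[Target]]` pattern that ignores redirects and title-case rules. An earlier comment mentions a `--minlinks` option; how the real tool applies it is not documented here.

```python
import re
from collections import Counter

# Simplified wikilink pattern: captures "Target" from [[Target]],
# [[Target|label]] and [[Target#section]]. Real wikitext has more cases.
LINK_RE = re.compile(r"\[\[([^\]|#]+)")


def count_inbound_links(articles):
    """articles: dict mapping title -> wikitext. Returns a Counter of inbound links."""
    inbound = Counter()
    for source, text in articles.items():
        # Count each linking article once per target, and ignore self-links.
        targets = {m.strip() for m in LINK_RE.findall(text)}
        targets.discard(source)
        for target in targets:
            if target in articles:
                inbound[target] += 1
    return inbound


def select_articles(articles, min_links=1, min_size=0, limit=None):
    """Return titles ranked by inbound link count (hypothetical parameters)."""
    inbound = count_inbound_links(articles)
    candidates = [
        title
        for title, text in articles.items()
        if inbound[title] >= min_links and len(text.encode("utf-8")) >= min_size
    ]
    # Most-linked first; ties are broken alphabetically for a stable result.
    candidates.sort(key=lambda t: (-inbound[t], t))
    return candidates[:limit]


articles = {
    "Denmark": "Denmark is a country. Its capital is [[Copenhagen]].",
    "Copenhagen": "Capital of [[Denmark]].",
    "Aarhus": "A city in [[Denmark]].",
    "Stub": "x",
}
print(select_articles(articles, min_links=1, min_size=10))
# prints ['Denmark', 'Copenhagen']
```

In this toy data set "Denmark" is linked from two other pages and "Copenhagen" from one, so they are kept in that order, while "Aarhus" and "Stub" have no inbound links and are dropped.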
Klaus,
Hello,
I am muhajer, but I forgot my password, so I am commenting as anonymous.
Thank you for the reply.
Is it 5000 bytes per article? That is very small, while the medium one is 1/4 of the big one, and it is 1/4 of the big one's size.
So I think it must contain complete articles, not just 5000 bytes (1000 words) per article!
Where does the size (800 MB) go, then?
I mean, is every article in the medium one in its complete length?
Regards
By Anonymous, at 9:29 pm