To pursue my idea, I recalled that OpenOffice.org also has a language checker, LanguageTool.
Today, I tested JPype to get the Java VM and a Java package running from Python. The original Java example is:
    JLanguageTool langTool = new JLanguageTool(Language.ENGLISH);
    langTool.activateDefaultPatternRules();
    List<RuleMatch> matches = langTool.check("A sentence " +
        "with a error in the Hitchhiker's Guide tot he Galaxy");
    for (RuleMatch match : matches) {
      System.out.println("Potential error at line " +
          match.getEndLine() + ", column " +
          match.getColumn() + ": " + match.getMessage());
      System.out.println("Suggested correction: " +
          match.getSuggestedReplacements());
    }
With the help of JPype, the equivalent code is:
    #!/usr/bin/python
    # Simple proof test of using Java package in Python

    from jpype import JPackage, startJVM, shutdownJVM

    startJVM("/opt/sun-jdk-1.6.0.13/jre/lib/amd64/server/libjvm.so",
             "-Djava.class.path=LanguageTool.jar")
    #startJVM("/path/to/libjvm.so", "-Djava.class.path=/path/to/LanguageTool.jar")

    LT = JPackage('de').danielnaber.languagetool
    langTool = LT.JLanguageTool(LT.Language.ENGLISH)
    langTool.activateDefaultPatternRules()
    matches = langTool.check("A sentence with a error " +
                             "in the Hitchhiker's Guide tot he Galaxy")
    for match in matches:
        print "Potential error at line ", match.getEndLine(),\
              ", column ", match.getColumn(), ": ", match.getMessage()
        print "Suggested correction: ", match.getSuggestedReplacements()

    shutdownJVM()
If you want to run it yourself, you need to set up two paths. The first points to the Java VM library; the one in the example is for Gentoo. On Linux, you can locate it with find / -name libjvm.so. The second points to LanguageTool.jar, which you can find inside the ZIP (oxt) download [1].
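As a rough illustration of the first path, the search done by find / -name libjvm.so can be sketched in Python. The helper names find_libjvm and pick_jvm are hypothetical, not part of JPype, and preferring the "server" directory is only an assumption based on the path used in the example above. The sketch is written for Python 3.

```python
import os


def find_libjvm(root):
    """Return the path of every libjvm.so found under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "libjvm.so" in filenames:
            found.append(os.path.join(dirpath, "libjvm.so"))
    return found


def pick_jvm(candidates):
    """Prefer a libjvm.so inside a 'server' directory, else take the first hit."""
    for path in candidates:
        if os.path.basename(os.path.dirname(path)) == "server":
            return path
    return candidates[0] if candidates else None
```

Calling pick_jvm(find_libjvm("/opt")) would then give a candidate to pass to startJVM. JPype also offers jpype.getDefaultJVMPath(), which may make the manual search unnecessary on many systems.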
Actually, I don't have to load the package at all: LanguageTool also supports a web server mode. Still, I think I prefer to load it directly if I am going to use it. I know there is also Jython, a Python implementation that runs on the Java VM, but I have never touched it.
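For comparison, talking to the web server mode could look roughly like the sketch below. It assumes a recent LanguageTool release running its HTTP server locally on port 8081 and exposing the /v2/check endpoint with JSON output; the version tested in this post may behave differently. The function names build_check_url and check are hypothetical. Written for Python 3.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_check_url(text, language="en-US", base="http://localhost:8081"):
    """Build the request URL for a (assumed) local LanguageTool server."""
    query = urlencode({"language": language, "text": text})
    return base + "/v2/check?" + query


def check(text, **kwargs):
    """Send the text to the server and return its list of matches.

    Needs a running server; nothing is contacted until this is called.
    """
    with urlopen(build_check_url(text, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))["matches"]
```

This avoids starting a JVM inside the Python process, at the cost of keeping a separate server running.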
As for the results, they look good:
    Potential error at line 0 , column 16 : Use <suggestion>an</suggestion> instead of 'a' if the following word starts with a vowel sound, e.g. 'an article', 'an hour'
    Suggested correction: [an]
    Potential error at line 0 , column 50 : Did you mean <suggestion>to the</suggestion>?
    Suggested correction: [to the]
However, the processing time is quite long. On my computer (Core 2 Duo, 1.83 GHz), it took around 300 ms to 500 ms.
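The figure above does not separate the individual steps, so one way to see where the time goes is to time each call on its own. The helper below is a minimal sketch; the name timed is hypothetical, and it is written for Python 3 (time.perf_counter).

```python
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

With the script above, something like matches, ms = timed(langTool.check, "A sentence with a error") would measure the check alone, and wrapping startJVM the same way would show how much of the total is JVM startup.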
I have not looked for other checkers yet. Maybe I will find one, perhaps even one written in Python.
[1] http://sourceforge.net/project/showfiles.php?group_id=27298&g=1 is gone.