tmalsburg / guess-language.el
Emacs minor mode that detects the language you're typing in. Automatically switches spell checker. Supports multiple languages per document.
Hi
I am a long-time user of ispell and flyspell (plus my own language-dependent abbrev tables). Usually I have 4 functions for switching the ispell dictionary among the four languages I use most (plus their corresponding abbrev tables), and I bind these to 4 different keys. The ispell dictionaries I have are:
american.hash -> /var/lib/ispell/american.hash
british.hash -> /var/lib/ispell/british.hash
british-insane.hash -> /var/lib/ispell/british-insane.hash
castellano.hash -> espa~nol.hash
odeutsch.hash -> ogerman.hash
francais.hash -> french.hash
So I changed the setting of guess-language-langcodes according to what is to be used for ispell-change-dictionary, which results in (just listing the differences):
(
(de "deutsch8" "German" "🇩🇪" "German")
(en "british" "English" "🇬🇧" "English")
(es "castellano8" nil "🇪🇸" "Spanish")
(fr "francais" "French" "🇫🇷" "French"))
Now I open a new file, start guess-language-mode (flyspell-mode is on) and type
Damit sind die Voraussetzungen des lokalen Existenzsatzes für
Anfangsdaten $U(T)$ gegeben und man erhält die Existenz einer Lösung
auf dem Intervall $[T,T+\epsilon]$. Aber warum ist das wahr?
Now let us look whether this true. But then what to we see
Jetzt geht es zurück nach Deutsch. Aber es passiert nichts.
The first paragraph is correctly identified as German, but the second is not. English is chosen only if I restart flyspell-mode. Is this the expected behaviour? If the switch is not done automatically, I might as well hit 2 keys myself: one for changing the language manually, one for restarting flyspell-mode.
What am I missing?
Regards
Uwe Brauer
The table of languages that are currently supported by guess-language-mode mentions:
Dutch nl de
Should that not be Dutch nl nl_NL?
I get the following error message:
Error in post-command-hook (flyspell-post-command-hook): (void-function find-function-library)
From the beginning of guess-language.el, I assume that the function find-function-library should be found in find-func, but no such function is present there in my case. However, a function find-library is present.
Since I use wcheck-mode instead of flycheck, I would like guess-language to support wcheck-mode.
I switched from flyspell to wcheck-mode because it uses vastly fewer resources.
Maybe add a note on the advantage of using file-local variables to the README?
E.g.
;; Local Variables:
;; ispell-check-comments: exclusive
;; ispell-local-dictionary: "en"
;; End:
%%% Local Variables:
%%% ispell-local-dictionary: "british"
%%% End:
Hi, I just noticed that the entry for Esperanto in guess-language-langcodes reads:
(eo . ("eo" "English" "🟩" "Esperanto"))
This should of course be:
(eo . ("eo" "Esperanto" "🟩" "Esperanto"))
M-x flyspell-buffer fails with the error flyspell-large-region: Wrong type argument: stringp, nil when guess-language-mode is enabled. The major mode does not seem to be a factor in reproducing the error (it breaks in fundamental, text and org modes).
Steps to reproduce:
Clone the repository into /tmp:
$ cd /tmp/
$ git clone git@github.com:tmalsburg/guess-language.el.git
Start Emacs with emacs -q.
Evaluate the following code (M-x eval-buffer in the scratch buffer):
(add-to-list 'load-path "/tmp/guess-language.el")
(load "guess-language")
(setq guess-language-languages '(en fr pt))
(setq guess-language-min-paragraph-length 35)
(setq guess-language-langcodes
'((en . ("en_US" "English"))
(fr . ("fr_FR" "French"))
(pt . ("pt_BR" "Portuguese"))
))
The following lines must be in a language different from the language of the first line to cause the bug.
The number of lines below seems to be the minimum necessary to make the bug manifest itself in my system.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
Uma linha longa o suficiente para que o problema se manifeste.
M-x flyspell-buffer
- Everything works as expected
M-x guess-language-mode
M-x flyspell-buffer
- Flyspell terminates with the following error before the end of the spell checking: flyspell-large-region: Wrong type argument: stringp, nil
Full output of the messages buffer:
For information about GNU Emacs and the GNU system, type C-h C-a.
Mark set
Loading /tmp/guess-language.el/guess-language.el (source)...done
You can run the command ‘eval-buffer’ with M-x ev-b RET
Loading /tmp/guess-language.el/guess-language.el (source)...done
(New file)
Mark set
Starting new Ispell process /usr/bin/aspell with default dictionary...
Checking region...
Spell Checking...100% [manifeste]
Spell Checking completed.
You can run the command ‘flyspell-buffer’ with M-x fl-bu RET
Spell Checking completed.
Guess-Language mode enabled in current buffer
You can run the command ‘guess-language-mode’ with M-x gue-mo RET
Guess-Language mode enabled in current buffer
Checking region...
Ispell process killed
Local Ispell dictionary set to pt_BR
Starting new Ispell process /usr/bin/aspell with pt_BR dictionary...
Detected language: Portuguese
flyspell-large-region: Wrong type argument: stringp, nil
Debugger output:
Debugger entered--Lisp error: (wrong-type-argument stringp nil)
flyspell-external-point-words()
flyspell-large-region(1 1221)
flyspell-region(1 1221)
flyspell-buffer()
funcall-interactively(flyspell-buffer)
call-interactively(flyspell-buffer record nil)
command-execute(flyspell-buffer record)
execute-extended-command(nil "flyspell-buffer" "flyspell-bu")
funcall-interactively(execute-extended-command nil "flyspell-buffer" "flyspell-bu")
call-interactively(execute-extended-command nil nil)
command-execute(execute-extended-command)
Emacs 26.1
guess-language from git (bc6fe11)
When opening a text file, I get the following error
Error in post-command-hook (flyspell-post-command-hook): (wrong-type-argument symbolp (quote typo-mode))
Looking in the code, I found a call (bound-and-true-p 'typo-mode) in the function guess-language-switch-typo-mode-function. Given that bound-and-true-p is a macro that takes an unquoted symbol as its argument, this is obviously an error.
Removing the quote solves the problem for me.
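For reference, a minimal sketch of the two correct ways to test an optional minor-mode variable such as typo-mode. This only illustrates the quoting rules and is not a copy of the package's code.

```elisp
;; `bound-and-true-p' is a macro: the variable name is passed unquoted.
(bound-and-true-p typo-mode)

;; `boundp' and `symbol-value' are functions: they take a quoted symbol.
(and (boundp 'typo-mode) (symbol-value 'typo-mode))
```

Both forms return nil when typo-mode is not loaded, instead of signalling an error.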
Hi,
when I create a commit from magit, I get:
Error in post-command-hook (flyspell-post-command-hook): (error "Undefined dictionary: en")
I've noticed that the cursor jumps around while text is being analysed by guess-language. Although this is barely noticeable for small paragraphs, it becomes easier to see for larger ones. I've managed to reproduce this issue using a rather small Emacs configuration, which I store in ~/Desktop/tmp/init.el:
;; ~/Desktop/tmp/init.el
(package-initialize)
(autoload 'flyspell-mode "flyspell" t)
(setq ispell-dictionary "british")
(setq ispell-current-dictionary "british")
(add-hook 'text-mode-hook 'flyspell-mode)
(require 'guess-language)
(setq guess-language-languages '(en da))
(setq guess-language-min-paragraph-length 35)
(add-hook 'text-mode-hook (lambda () (guess-language-mode 1)))
Now, open an empty file (e.g. ~/Desktop/tmp/Demo.txt) in Emacs:
emacs -Q -l ~/Desktop/tmp/init.el ~/Desktop/tmp/Demo.txt
Initially, Emacs uses the british dictionary.
First, I start typing a Danish sentence. Eventually the number of characters reaches 35, which triggers the guess-language analysis. While the analysis is being performed, the cursor jumps between the "misspelled" words. Based on this analysis, guess-language correctly switches the dictionary to danish.
Another way to make the cursor jump is by pasting a large chunk of text that contains several "misspelled" words (according to the current dictionary). For example, delete the Danish sentence and paste a large chunk of "Lorem Ipsum": since none of these words are included in the Danish dictionary, this causes the cursor to jump a lot.
Below is a GIF that illustrates the scenario explained above:
I had to reduce the quality of the GIF - otherwise Github would not allow me to upload it. Let me know if the quality of the GIF is a problem.
System info:
GNU Emacs 26.0.50.2 (x86_64-pc-linux-gnu, GTK+ Version 3.18.9) of 2017-02-01
guess-language 20170204.311
I started using this package, and I'm very happy with it. The language detection mechanism works very well. Thanks!
One feature that I'm missing, though, is a way to perform additional actions after the language detection mechanism has finished executing. Usually when I switch from the English to the Danish dictionary I also set the input method to danish-postfix, which allows me to use the English (US) keyboard layout while still being able to write Danish letters that are not available on the English keyboard layout. Setting the input method can be done like this:
M-x set-input-method RET danish-postfix RET
Now when I type "oe", Emacs changes it to "ø", "aa" becomes "å" and so on. The important thing, however, is that the keyboard layout remains the same.
Right now I have to change the input method manually every time guess-language switches dictionary. It would be better if guess-language could help me change the input method too. I tried to look for a hook that would allow me to run additional code when the dictionary changes (some ispell hook), but I couldn't really find a clever way to do this. Since other users of guess-language might want to perform a similar task, I thought that adding a guess-language-after-autoset-hook would be a good idea (perhaps you can think of a better name)? The hook is intended to be run immediately before guess-language-autoset exits. What do you think?
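A minimal sketch of how such a hook could be used. It assumes the package exposes an abnormal hook named guess-language-after-detection-functions whose functions receive the detected language symbol and the paragraph bounds. That name does not appear in this request, so check it against your installed version. The helper name my/guess-language-set-input-method is hypothetical.

```elisp
;; Hypothetical helper; the hook name below is an assumption, see above.
(defun my/guess-language-set-input-method (lang _beginning _end)
  "Switch the input method according to the detected language LANG."
  (pcase lang
    ;; Danish: type "oe" for "ø", "aa" for "å", and so on.
    ('da (set-input-method "danish-postfix"))
    ;; Any other language: fall back to plain keyboard input.
    (_ (deactivate-input-method))))

(add-hook 'guess-language-after-detection-functions
          #'my/guess-language-set-input-method)
```

Keeping the language-to-input-method mapping in one function means further languages only need another pcase branch.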
I noticed that the dictionary names in guess-language-langcodes sometimes use English names, sometimes native names, and sometimes just a two-letter language code. Cf.:
(cs . ("czech" "Czech" "🇨🇿" "Czech"))
(da . ("dansk" nil "🇩🇰" "Danish"))
(de . ("de" "German" "🇩🇪" "German"))
I ran into this when I noticed that Emacs couldn't spell-check Dutch. It turns out that aspell, which I use as back-end, doesn't recognise nederlands; it only recognises dutch.
I changed the entry in guess-language-langcodes (it's a user option, after all), but it made me wonder whether it might make sense to add multiple entries for some languages. If aspell knows the language as dutch, but ispell or hunspell uses nederlands, wouldn't it be better to have entries for both?
Or is this something that should be handled by Emacs' ispell module?
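As an illustration of the per-user workaround described above, here is a sketch that replaces only the dictionary name of the nl entry and keeps the remaining fields untouched. It assumes the dictionary name is the first element of each entry, as in the cs/da/de examples quoted earlier.

```elisp
(with-eval-after-load 'guess-language
  ;; Keep the other fields of the existing `nl' entry and swap in the
  ;; dictionary name that aspell understands.
  (setf (alist-get 'nl guess-language-langcodes)
        (cons "dutch" (cdr (alist-get 'nl guess-language-langcodes)))))
```

If the list has no nl entry, this creates one that holds only the dictionary name.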
Is it possible to add support for flycheck-prog-mode?
After some recent changes (d2a1330), guess-language depends on the advice 0.1 package, which is not on MELPA, and consequently guess-language fails to build.
I was wondering what the rationale is behind the run-at-time at https://github.com/tmalsburg/guess-language.el/blob/master/guess-language.el#L247? I'm getting a lot of "Blocking call to accept-process-output with quit inhibited" messages as a result of this. This isn't so bad, but Emacs also occasionally hangs, because two of those run-at-time calls at the same time seem to cause problems (?).
Just calling the body of that function locally seems to work, but I didn't test it extensively.
Hi,
I just upgraded guess-language from MELPA. I have this snippet in my init file:
(use-package guess-language ; Automatically detect language for Flyspell
:ensure t
:defer t
:init (add-hook 'text-mode-hook #'guess-language-mode)
:config
(validate-setq guess-language-langcodes '((en . ("en_GB" "English"))
(it . ("it_IT" "Italian")))
guess-language-languages '(en it)
guess-language-min-paragraph-length 45)
:diminish guess-language-mode)
I get this warning upon restarting Emacs:
Error (use-package): guess-language :config: Looking for `(alist :key-type symbol :value-type ...)' in `((en "en_GB" "English") (it "it_IT" "Italian"))' failed because:
Looking for `(repeat (cons symbol list))' in `((en "en_GB" "English") (it "it_IT" "Italian"))' failed because:
Looking for `(cons symbol list)' in `(en "en_GB" "English")' failed because:
Looking for `list' in `("en_GB" "English")' failed because:
wrong number of elements
I am using validate-setq from @Malabarba's validate to be sure I am setting the right values everywhere in my init file.
Everything works fine with regular setq, so this might not be related to your package.
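One possible reading of "wrong number of elements" is that the customization type expects every entry to carry all four fields seen in other entries quoted on this page (dictionary, typo-mode language, flag, language name). Under that assumption, which the report does not confirm, a full-length setting might look like this:

```elisp
;; Sketch only: four-element entries, modelled on entries such as
;; (de . ("de" "German" "🇩🇪" "German")) quoted elsewhere on this page.
(setq guess-language-langcodes
      '((en . ("en_GB" "English" "🇬🇧" "English"))
        (it . ("it_IT" "Italian" "🇮🇹" "Italian")))
      guess-language-languages '(en it)
      guess-language-min-paragraph-length 45)
```

Whether validate-setq accepts this form depends on the type declared in the installed version of the package.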
I am using GNU Emacs 30.0.60 and installed the guess-language.el package, version 0.0.1, from GNU ELPA. When I call guess-language from a text buffer, the following error happens:
guess-language-compile-regexps: Opening input file: Aucun fichier ou dossier de ce type, /home/matthias/.config/emacs/elpa/guess-language-0.0.1/trigrams/en
Which is not surprising ("Aucun fichier ou dossier de ce type" is the French-locale form of "No such file or directory"):
matthias@peitho:~$ tree .config/emacs/elpa/guess-language-0.0.1/
.config/emacs/elpa/guess-language-0.0.1/
├── guess-language-autoloads.el
├── guess-language.el
├── guess-language.elc
└── guess-language-pkg.el
1 directory, 4 files
No problem with the version from MELPA:
matthias@peitho:~$ ls -l .config/emacs/elpa/guess-language-20240528.1319/ | wc -l
73
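The same check can be done from inside Emacs. This sketch assumes, based on the path in the error message above, that the trigram files are expected in a trigrams directory next to guess-language.el.

```elisp
(let* ((lib (locate-library "guess-language"))
       (dir (and lib (expand-file-name "trigrams"
                                       (file-name-directory lib)))))
  (cond
   ((null lib)
    (message "guess-language is not on the load-path"))
   ((file-directory-p dir)
    ;; Count the entries, skipping dot files.
    (message "trigrams found: %d files"
             (length (directory-files dir nil "\\`[^.]"))))
   (t
    (message "trigrams directory missing: %s" dir))))
```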
Some modes insert content into the buffer that is read-only and not related to the actual content that the user writes.
For example, in ERC/Circe this is the messages sent by other users; in Mastodon it is the same, plus the keybindings shown.
Hi,
I've been experiencing a terrible slowdown in Org buffers recently, especially when inside tables, but also when just moving the cursor around partially collapsed headings. A quick profiling run showed that guess-language is the culprit, especially the call to how-many in guess-language-region:
- flyspell-post-command-hook 2958 89%
- flyspell-word 2958 89%
- flyspell-highlight-duplicate-region 2940 88%
- run-hook-with-args-until-success 2940 88%
- guess-language-function 2940 88%
- guess-language 2940 88%
- guess-language-region 2883 87%
how-many 2883 87%
+ backward-paragraph 49 1%
+ forward-paragraph 8 0%
+ org-mode-flyspell-verify 18 0%
+ command-execute 323 9%
+ yas--post-command-handler 14 0%
+ redisplay_internal (C function) 9 0%
+ timer-event-handler 3 0%
+ ... 0 0%
It seems that the longer the Org file, the bigger the slowdown. I'm guessing that this may be caused by the fact that backward-paragraph in an Org buffer may travel very far back: in one particular Org file of mine, it moves almost all the way to the beginning of the buffer.
I tried the obvious thing, i.e., using org-backward-paragraph and org-forward-paragraph in guess-language-paragraph if major-mode is org-mode, but that didn't seem to have much of an effect. Perhaps you know of a better way to deal with the issue?
Hello!
Version 0.0.1 of the guess-language package in the GNU ELPA archive is very old and has a critical bug: the recipe does not contain the files from the trigrams/ directory.
When guess-language-mode is activated, Emacs shows this error message:
Error in post-command-hook (flyspell-post-command-hook): (file-missing "Opening input file" "File or folder does not exists" "/home/dunaevsky/.emacs.d/elpa/guess-language-0.0.1/trigrams/en")
Please update the recipe for the GNU ELPA archive.
I was looking for a package that would enable me to automatically switch the dictionary of my spell-checker. In addition to this package, I also found auto-dictionary-mode.
I think it would be helpful to add a small section to the README explaining the most important differences between this package and similar ones. It seems to me that this package and auto-dictionary-mode are trying to achieve the same goal. I'm interested in knowing what I gain by using this package in particular (rather than auto-dictionary-mode). Although auto-dictionary-mode does not seem to be actively maintained, it is quite popular (based on its number of downloads). Also, can you say anything about the performance of these two packages?
Thanks in advance.
hi, i'm interested in using this just for the guess-language part only (i.e. not the typo-mode setting or spellchecking) but using all possible languages.
is it possible that there's no japanese (ja), chinese (zh), and korean (ko) in the trigrams data? or am i confused about it somehow?
i did a few tests with chinese and japanese texts and guess-language-region returned zu, i.e. Zulu.
but i must be a little confused, as guess_language.py supports those languages, but it doesn't have ja, zh, or ko in its trigrams files.
perhaps the python package simply selects those languages (and greek) by their script, using the Blocks.txt file? would it be possible to support that also in guess-language.el?
i guess if that's the issue i'm encountering it would require a bit of work to support those languages in this package...
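a rough sketch of what script-based detection could look like in Emacs Lisp, using the built-in char-script-table instead of Blocks.txt. the function name my/guess-language-by-script and the decision order are hypothetical, and this is a heuristic, not how guess-language.el or guess_language.py works.

```elisp
(defun my/guess-language-by-script (beg end)
  "Guess ja, ko, zh or el for the text between BEG and END by script.
Return nil if none of those scripts occurs.  Hypothetical helper."
  (let ((counts nil))
    (save-excursion
      (goto-char beg)
      (while (< (point) end)
        ;; `char-script-table' maps a character to a script symbol
        ;; such as `han', `kana', `hangul', `greek' or `latin'.
        (let ((script (aref char-script-table (char-after))))
          (when script
            (setf (alist-get script counts 0)
                  (1+ (alist-get script counts 0)))))
        (forward-char 1)))
    ;; Kana is checked before han because Japanese text mixes both.
    (cond ((> (alist-get 'kana counts 0) 0) 'ja)
          ((> (alist-get 'hangul counts 0) 0) 'ko)
          ((> (alist-get 'han counts 0) 0) 'zh)
          ((> (alist-get 'greek counts 0) 0) 'el)
          (t nil))))
```

a caller could try this first and fall back to the trigram-based guess-language-region when it returns nil.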
Hi there,
I've been experiencing some problems using guess-language and org-mode.
M-x guess-language in an Org file, anywhere on the second line of the example below, causes the error:
Wrong type argument: number-or-marker-p, nil
- First item. I'm just writing something longer so that the minimum
number of characters is reached.
This problem also shows up (not always) when fly-spelling the whole buffer (M-x flyspell-buffer), which stops the verification before it reaches the end of the file.
My guess is that this error is related to the last part of issue #17 (comment) and also to commit 65dccb1, which deals with paragraph navigation in org-mode files.
Using:
Emacs 26.1
Org mode 9.1.14
guess-language head from git repository.
Debugger output:
Debugger entered--Lisp error: (wrong-type-argument number-or-marker-p nil)
org-list-struct()
guess-language-forward-paragraph()
guess-language()
funcall-interactively(guess-language)
call-interactively(guess-language record nil)
command-execute(guess-language record)
helm-M-x(nil "guess-language")
funcall-interactively(helm-M-x nil "guess-language")
call-interactively(helm-M-x nil nil)
command-execute(helm-M-x)
Please let me know if you need further info to pinpoint the error.
Hi, this looks like it's exactly the thing I've been looking for, but I'm having trouble setting it up. I installed the package and added the following to my init file:
(use-package guess-language
:ensure t
:init
(add-hook 'text-mode-hook #'guess-language-mode)
:diminish guess-language-mode)
But when I open a text file, flyspell doesn't work (even though I get a message about it starting up). The *Messages* buffer contains a message such as the following:
Error in post-command-hook (flyspell-post-command-hook): (file-error "Opening input file" "No such file or directory" "/home/joost/src/criticmarkup-emacs/trigrams/en")
The directory in the error message is the one containing the file I just opened.
Am I doing something wrong or is this a bug somewhere?
I wondered whether you could add a short explanation of how to add other languages. It currently just says that it's "easy to add", but I have no idea how to do that.
I'm specifically interested in Serbian, but I believe this explanation could be made general and of value to users of many other languages.
Thanks for the great package!
Emacs -Q:
(progn
(setq package-archives '(("gnu" . "https://elpa.gnu.org/packages/")
("nongnu" . "https://elpa.nongnu.org/nongnu/")))
(package-install 'magit)
(package-install 'guess-language)
(package-activate-all)
(setq guess-language-languages '(en sl))
(setq-default ispell-dictionary "slovenian")
(add-hook 'text-mode-hook 'flyspell-mode)
(add-hook 'flyspell-mode-hook 'guess-language-mode))
Each M-x magit-commit-create will leave an extra aspell process.
(A bit more info can be found at http://debbugs.gnu.org/cgi/bugreport.cgi?bug=48379 )
Line 168 should have:
(boundp 'typo-mode)
not
(boundp typo-mode)
Country flags are not appropriate for some languages in which case we might want to show some other fancy unicode character. Motivation: The current language is difficult to spot in busy mode lines.
As far as I understand, Emacs currently lacks the necessary support for combined glyphs like 🇩🇪 (1F1E9 and 1F1EA unified), but in the future this will likely be possible.
Edit: Combined glyphs and native display of color emoji are supported as of version 28.1.