clj-det-enc is an encoding detector built on the juniversalchardet Java library.
(use 'hozumi.det-enc)
Usage: (detect target)
(detect "utf8.txt")
=> "UTF-8"
(detect "unknown.txt")
=> nil
Usage: (detect target encodingname-when-unknown)
(detect "unknown.txt" "EUC-JP")
=> "EUC-JP"
(detect "unknown.txt" :default)
=> "SHIFT_JIS"
return:
The encoding name (a String), or nil when the target encoding cannot be detected.
target:
Anything clojure.java.io/input-stream accepts
(File, filename (String), InputStream, BufferedInputStream, etc.).
The target stream is closed automatically.
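
The following is a minimal sketch of what "whatever clojure.java.io/input-stream can deal with" means in practice. It does not use clj-det-enc itself. The helper first-byte, the sample data and the temporary file are hypothetical names made up for this illustration, and using with-open to close the stream is an assumption about how automatic closing could be done, not a statement about the library's internals.

```clojure
(require '[clojure.java.io :as io])
(import '[java.io ByteArrayInputStream File])

;; Hypothetical sample data, written to a temporary file for the demo.
(def sample-bytes (.getBytes "hello" "UTF-8"))
(def tmp (doto (File/createTempFile "det-enc" ".txt") (.deleteOnExit)))
(io/copy sample-bytes tmp)

(defn first-byte
  "Opens target with io/input-stream, reads one byte, and lets
  with-open close the stream afterwards (assumed closing strategy)."
  [target]
  (with-open [in (io/input-stream target)]
    (.read in)))

;; The same function accepts each kind of target listed above.
(first-byte tmp)                                  ; File
(first-byte (.getPath tmp))                       ; filename (String)
(first-byte (ByteArrayInputStream. sample-bytes)) ; InputStream
```

All three calls read the same first byte, since each target is coerced to an InputStream over the same content.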
encodingname-when-unknown:
The value returned when the target encoding cannot be detected.
- :default means the default charset of your Java virtual machine.
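
The result of :default therefore depends on the machine the code runs on. As a rough sketch, the JVM's default charset can be inspected with java.nio.charset.Charset. The function default-charset-name is a hypothetical helper for this illustration. Whether clj-det-enc uses this exact call, and how it spells the returned name, is an assumption.

```clojure
(import '[java.nio.charset Charset])

(defn default-charset-name
  "Returns the canonical name of the JVM's default charset as a String."
  []
  (.name (Charset/defaultCharset)))

;; The value varies with the platform and JVM settings.
(default-charset-name)
```

Passing an explicit encoding name instead of :default keeps the fallback the same across machines.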
Which encodings can be detected? See the juniversalchardet documentation.
Leiningen:
[org.clojars.hozumi/clj-det-enc "1.0.0-SNAPSHOT"]