
gdcv - GoldenDict console version and emacs dynamic module

License: GNU General Public License v3.0

Makefile 0.04% C 99.47% Emacs Lisp 0.49%
dictionary dsl emacs-modules goldendict lingvo cli emacs

Description

gdcv is a command-line interface for searching GoldenDict dictionaries (*.dsl.dz). The program is a very rudimentary workaround that allows searching until an official command-line interface is available.

As an example of a similar interface, GoldenDict has gdcl (Command-line interface for Goldendict), but for my dictionary collection (10 GB) it is really slow because it has no pre-made index for faster searching.

StarDict has sdcv (StarDict Console Version), but it can only handle dictionaries in the StarDict format. For users of GoldenDict who have large collections of dictionaries in DSL format, converting and maintaining two parallel sets of dictionaries is not a practical solution.

gdcv does not require an installation of GoldenDict and does not use GoldenDict’s pre-made index files; it creates its own index.

The main catch is that gdcv is written in C by a person who has never programmed anything before. It seems the program does not leak memory, but the software architecture is horrible.

Please vote for an official command-line interface for GoldenDict.

Features

  • Search a word in a dictionary
  • Find words that contain a substring
  • Syntax highlighting
  • Extracting media files
  • Emacs dynamic module, if necessary

Installation

Requirements

  1. gcc or clang
  2. zlib
  3. unzip (in order to unpack resources in zip files)

For Ubuntu:

sudo apt install zlib1g-dev unzip
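
Before compiling, it can help to check which command-line tools are present. The following is a minimal sketch; missing_tools is a hypothetical helper name, not part of gdcv. It only checks commands, so the zlib development headers (a library, not a command) still have to be verified through the package manager.

```sh
# Hypothetical helper: print each named command that is not found on $PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: list what is missing among the tools used in this README.
# Either gcc or clang is enough, so check the compiler separately.
# missing_tools make unzip
# missing_tools gcc clang
```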

Compilation

git clone https://github.com/konstare/gdcv
cd gdcv
make gdcv

Put the gdcv binary into a directory that is on your $PATH.
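
To check whether a candidate directory is already on $PATH, something like the sketch below can be used. dir_on_path is a hypothetical helper, and the directory in the usage comment is only an example location.

```sh
# Hypothetical helper: succeed if the given directory is an entry of $PATH.
dir_on_path() {
    # Wrap both sides in colons so only whole entries match.
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example (the target directory is an assumption, pick your own):
# dir_on_path "$HOME/.local/bin" && cp gdcv "$HOME/.local/bin/"
```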

Configuration

Convert dictionaries from UTF-16 to UTF-8

Sometimes GoldenDict dictionaries (DSL) come in UTF-16. It is above my pay grade to deal with that, so all dictionaries should be converted to UTF-8. This is not a problem for GoldenDict, as it supports both UTF-16 and UTF-8. Moreover, dictionaries in UTF-8 take 20% less space.

dos2unix is needed to convert the dictionaries.

In the GoldenDict directory:

find . -name '*.dsl.dz' -exec  gunzip -S .dz {}   \;
find . -name '*.dsl' -exec  dos2unix --add-bom -f {} \;
find . -name '*.dsl' -exec  gdcv -z {} \;
find . -name '*.ann' -exec  dos2unix --add-bom -f {} \;
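
To see which uncompressed files actually need converting, the byte order mark at the start of each file can be inspected. This is a rough sketch that assumes the files start with a BOM, which is not guaranteed; dsl_encoding is a hypothetical helper name. It does not distinguish UTF-32 from UTF-16.

```sh
# Hypothetical helper: guess a file's encoding from its byte order mark.
dsl_encoding() {
    # Read the first three bytes as lowercase hex, e.g. "fffe23".
    bom=$(od -An -tx1 -N3 < "$1" | tr -d ' \n')
    case $bom in
        efbbbf) echo utf-8-bom ;;
        fffe*)  echo utf-16le ;;
        feff*)  echo utf-16be ;;
        *)      echo no-bom ;;
    esac
}

# Example: report the guessed encoding of every .dsl file in this directory.
# for f in ./*.dsl; do printf '%s: %s\n' "$f" "$(dsl_encoding "$f")"; done
```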

Create Index

gdcv -i /path/to/directory/in/DSL/format /path/to/second/directory/in/DSL/format

Two index files will be saved in the directory: $XDG_CONFIG_HOME/gdcv/ or $HOME/.config/gdcv/ if XDG_CONFIG_HOME is not defined.
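
The location rule above can be expressed in shell as follows. This is a sketch, with gdcv_index_dir as a hypothetical helper name. Note one assumption: the ":-" expansion also falls back to $HOME/.config when XDG_CONFIG_HOME is set but empty, which may or may not match what gdcv itself does in that case.

```sh
# Hypothetical helper: print the directory where the index files are expected.
gdcv_index_dir() {
    # Falls back to $HOME/.config when XDG_CONFIG_HOME is unset (or empty).
    printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/gdcv"
}

# Example: list the index files after running "gdcv -i ...".
# ls -l "$(gdcv_index_dir)"
```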

Usage

To look up the word “cat”:

gdcv cat

./video/cli.gif

To search for words containing “cat”:

gdcv *cat
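
One caveat that is general to shells rather than specific to gdcv: an unquoted * is a glob character, so if the current directory contains a file whose name ends in "cat", the shell replaces the pattern with that file name before the program sees it. The sketch below demonstrates this with a hypothetical show_args helper that just prints its arguments; quoting the pattern avoids the problem. (By default zsh also refuses to run the command at all when nothing matches.)

```sh
# Hypothetical helper: print each received argument in brackets.
show_args() {
    printf '[%s]' "$@"
    printf '\n'
}

# In a directory containing a file named "bobcat":
#   show_args *cat     prints [bobcat]  (the shell expanded the glob)
#   show_args '*cat'   prints [*cat]    (the pattern is passed through)
# So the safer form of the substring search is:
#   gdcv '*cat'
```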

To extract all additional media files to the /tmp directory:

gdcv cat --unzip=/tmp/

Create dictionary

It is easy to create a dictionary in GoldenDict format. Essentially, the format is the one described in the DSL format specification (see Useful links), except that the @ tag is not supported. A typical dictionary in DSL format can be used as a starting point. The DSL file has to be compressed in dictzip format and moved to the GoldenDict directory.
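
As an illustration, a minimal DSL source file could be generated like this. It is only a sketch: write_sample_dsl, the dictionary name, and the single entry are made up for the example. The leading UTF-8 BOM mirrors the --add-bom option used in the conversion step; whether gdcv strictly requires it is an assumption.

```sh
# Hypothetical helper: write a one-entry DSL dictionary to the given path.
write_sample_dsl() {
    {
        printf '\357\273\277'                     # UTF-8 byte order mark
        printf '#NAME "Sample (En-En)"\n'         # header lines
        printf '#INDEX_LANGUAGE "English"\n'
        printf '#CONTENTS_LANGUAGE "English"\n'
        printf '\n'
        printf 'cat\n'                            # headword starts at column 0
        printf '\t[m1][trn]a small domesticated feline[/trn][/m]\n'  # body line is indented
    } > "$1"
}

# Example: create the file that the compression command expects.
# write_sample_dsl dictionary.dsl
```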

gdcv -z dictionary.dsl

Note that the resulting dsl.dz file can be uncompressed again with gunzip:

gunzip -S .dz dictionary.dsl.dz
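
This works because dictzip files are valid gzip files that carry their random-access table in the gzip "extra field". A quick way to look at the header is sketched below; dz_kind is a hypothetical helper, and the sketch only inspects the flag byte, it does not validate the dictzip table itself.

```sh
# Hypothetical helper: classify a file by its gzip header.
dz_kind() {
    # First four bytes of a gzip member: magic (1f 8b), method, flags.
    hdr=$(od -An -tx1 -N4 < "$1" | tr -d ' \n')
    case $hdr in
        1f8b????) ;;
        *) echo not-gzip; return 0 ;;
    esac
    flags=${hdr#??????}              # flag byte as two hex digits
    # Bit 2 (value 4) is FEXTRA: an extra field is present.
    if [ $(( 0x$flags & 4 )) -ne 0 ]; then
        echo gzip-with-extra-field
    else
        echo plain-gzip
    fi
}

# Example:
# dz_kind dictionary.dsl.dz
```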

Emacs module

For many years I have successfully used sdcv-mode and sdcv in my workflow. It turns out that all modern dictionaries are formatted in GoldenDict format (DSL). I tried to convert DSL to StarDict format with pyglossary, but the result was mediocre. There is goldendict-el for Emacs, but I wanted something similar to sdcv-mode.

To install gdcv-mode, compile gdcv and the Emacs module, then create the index files:

make gdcv emacs-module
gdcv -i /path/to/directory/in/DSL/format

Copy gdcv-elisp.so and gdcv.el to a directory in the load-path. For example:

cp gdcv-elisp.so ~/.emacs.d/site-lisp/
cp gdcv.el ~/.emacs.d/site-lisp/

Configuration

Add to the init file:

(use-package gdcv
  :load-path "~/.emacs.d/site-lisp"
  :bind (("C-c d" . gdcv-search-word)))

If the index file is not saved in the default directory, add:

(setq gdcv-index-path "path/to/index/file")

To show selected dictionaries first, modify gdcv-default-dictionary-list:

(setq gdcv-default-dictionary-list '("OxfordDictionary (En-En)" "Merriam-Webster's Advanced Learner's Dictionary (En-En)"))

All media files for the translated word are unpacked to gdcv-media-temp-directory and are played by the gdcv-play-media function (by default it is just a wrapper around xdg-open).

(setq gdcv-media-temp-directory "/tmp/gdcv/"
      gdcv-play-media (lambda (file)
                        (let ((process-connection-type nil))
                          (start-process "" nil "xdg-open" file))))

Usage

Press C-c d to translate the word (or text selection) under the cursor.

./video/emacs.gif

gdcv-mode comes with a simple ivy interface, ivy-gdcv, which can be used to search for a word. By default a prefix search is used: for “cat”, one gets “cat”, “catamaran”, “cater”, … For a substring search, type “*cat” to get “cat”, “muscatel”, …

./video/ivy.gif
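
The difference between the two search modes can be illustrated on a plain word list. This sketch says nothing about how gdcv's index implements the search; search_words is a hypothetical helper that reads one word per line from standard input and treats a leading * the same way as the examples above.

```sh
# Hypothetical helper: filter a word list (stdin) by prefix or substring.
search_words() {
    pattern=$1
    case $pattern in
        '*'*)
            # Leading "*": substring search on the rest of the pattern.
            grep -F -- "${pattern#\*}"
            ;;
        *)
            # Default: keep only words that start with the pattern.
            awk -v p="$pattern" 'index($0, p) == 1'
            ;;
    esac
}

# Example:
# printf '%s\n' cat catamaran cater muscatel dog | search_words cat
#   prints cat, catamaran, cater (one per line)
# printf '%s\n' cat catamaran cater muscatel dog | search_words '*cat'
#   additionally prints muscatel
```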

Useful links

Examples of dictionaries in DSL

DSL format specification:

http://lingvo.helpmax.net/en/troubleshooting/dsl-compiler/your-first-dsl-dictionary/

Typical dictionary in DSL format

https://github.com/Tvangeste/SampleDSL

Tools for creating DSL-format dictionaries

https://github.com/dohliam/dsl-tools

Command-line interface for Goldendict dictionaries written in ruby

https://github.com/dohliam/gdcl

Lingvo dictionaries decompiler

A tool for converting dictionary files aka glossaries with various formats for different dictionary applications

https://github.com/ilius/pyglossary


gdcv's Issues

Any support for EPWING dictionaries?

I make extensive use of EPWING dictionaries in my translation work - I wonder what might be needed to support that format (as Goldendict itself already does).

Perhaps I could contribute something

Does it work in Windows?

Does it work in Windows? When I compile, this comes up:
$ make gdcv
gcc -o gdcv gdcv.c dictzip.c utils.c format.c index.c utfproc/utf8proc.c -Wall -Wextra -pedantic -LC:/msys64/mingw64/lib -lz -O2 -march=native
gdcv.c:3:10: fatal error: argp.h: No such file or directory
    3 | #include <argp.h> //argp
      |          ^~~~~~~~
compilation terminated.

segfault when indexing several directories

I'm trying to index several directories like this:

❯ ~/work_self/gdcv/gdcv -i En-En En-ru Ru-En
[1]    3636538 segmentation fault (core dumped)  ~/work_self/gdcv/gdcv -i En-En En-ru Ru-En
