kcws's Introduction

References

  The BiLSTM+CRF model in this project follows the paper http://www.aclweb.org/anthology/N16-1030 ; the IDCNN+CRF model follows the paper https://arxiv.org/abs/1702.02098

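Build

  1. Install the Bazel build tool and TensorFlow (this project currently requires TensorFlow 1.0.0alpha or later).

  2. Change to the project source directory and run ./configure

  3. Build the backend service: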

    bazel build //kcws/cc:seg_backend_api

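Training

  1. Follow the WeChat official account 待字闺中 and reply "kcws" to get the corpus download link.

  2. Extract the corpus into a directory.

  3. Change to the source directory and run: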

python kcws/train/process_anno_file.py <corpus directory> pre_chars_for_w2v.txt

bazel build third_party/word2vec:word2vec

First, build a preliminary vocabulary:

./bazel-bin/third_party/word2vec/word2vec -train pre_chars_for_w2v.txt -save-vocab pre_vocab.txt -min-count 3

Handle low-frequency tokens:

python kcws/train/replace_unk.py pre_vocab.txt pre_chars_for_w2v.txt chars_for_w2v.txt
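A minimal sketch of what the low-frequency replacement step might look like. It assumes pre_vocab.txt holds one "token count" pair per line (the layout written by word2vec's -save-vocab), that the corpus file is whitespace-tokenized, and that rare tokens are mapped to a placeholder. The placeholder `<UNK>` and the function names are assumptions for illustration; the actual replace_unk.py may work differently.

```python
def load_vocab(lines):
    """Collect tokens from 'token count' lines (assumed -save-vocab layout)."""
    vocab = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            vocab.add(parts[0])
    return vocab


def replace_rare(sentence, vocab, unk="<UNK>"):
    """Replace every whitespace-separated token that is missing from vocab."""
    return " ".join(tok if tok in vocab else unk for tok in sentence.split())


# Tokens dropped by -min-count never reach the saved vocabulary,
# so they are the ones that get replaced here.
vocab = load_vocab(["</s> 100", "我 12", "们 9"])
print(replace_rare("我 们 龘", vocab))
```

Because the preliminary vocabulary was saved with -min-count 3, any token absent from it occurred fewer than three times, which is why a plain membership test is enough in this sketch.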

Train word2vec:

./bazel-bin/third_party/word2vec/word2vec -train chars_for_w2v.txt -output vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5

Build the training-data generation tool:

bazel build kcws/train:generate_training

Generate the training data:

./bazel-bin/kcws/train/generate_training vec.txt <corpus directory> all.txt

Produce the train.txt and test.txt files:

python kcws/train/filter_sentence.py all.txt

  4. With TensorFlow installed, change to the kcws source directory and run:

python kcws/train/train_cws.py --word2vec_path vec.txt --train_data_path <absolute path to train.txt> --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001

  (The IDCNN model is used by default; pass "--use_idcnn False" to switch to the BiLSTM model.)

  5. Generate the vocabulary

bazel build kcws/cc:dump_vocab

./bazel-bin/kcws/cc/dump_vocab vec.txt kcws/models/basic_vocab.txt

  6. Export the trained model

python tools/freeze_graph.py --input_graph logs/graph.pbtxt --input_checkpoint logs/model.ckpt --output_node_names "transitions,Reshape_7" --output_graph kcws/models/seg_model.pbtxt

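  7. Download the POS tagging model (a temporary measure; later documentation will cover training and exporting the POS tagging model)

    Download pos_model.pbtxt from https://pan.baidu.com/s/1bYmABk into the kcws/models/ directory

  8. Run the web service

./bazel-bin/kcws/cc/seg_backend_api --model_path=kcws/models/seg_model.pbtxt --vocab_path=kcws/models/basic_vocab.txt --max_sentence_len=80

  (For --model_path, use the absolute path to seg_model.pbtxt.)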

Training instructions for POS tagging:

https://github.com/koth/kcws/blob/master/pos_train.md

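Custom dictionary

Custom dictionaries are currently supported at the decoding stage; see kcws/cc/test_seg.cc for how to use one. The dictionary is a plain-text file in which each line has the following format:

<custom entry>\t<weight>

For example:

蓝瘦香菇 4

The weight is a positive integer, usually 4 or higher; the larger the value, the more important the entry.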
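To make the line format concrete, here is a small validation sketch. It assumes the separator is a single tab, as the format line states, and that the weight must parse as a positive integer. The function name is hypothetical and is not part of kcws.

```python
def parse_dict_line(line):
    """Split one tab-separated dictionary line into (entry, weight)."""
    entry, sep, weight_text = line.rstrip("\n").partition("\t")
    if not sep or not entry:
        raise ValueError("expected '<entry><TAB><weight>': %r" % line)
    weight = int(weight_text)  # raises ValueError for non-numeric weights
    if weight <= 0:
        raise ValueError("weight must be a positive integer: %r" % line)
    return entry, weight


print(parse_dict_line("蓝瘦香菇\t4"))
```

Checking a dictionary file this way before handing it to the decoder can catch lines where the tab was replaced by spaces, which is easy to do by accident when copying entries.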

demo

http://45.32.100.248:9090/

Appendix: a company-name recognition demo trained with the same model:

http://45.32.100.248:18080

kcws's People

Contributors

koth, vsooda

kcws's Issues

error when run bazel build //kcws/cc:seg_backend_api

Can you tell me how I can fix it? Thank you!

ERROR: Failed to load Skylark extension '@io_bazel_rules_closure//closure/private:java_import_external.bzl'.
It usually happens when the repository is not defined prior to being used.
Maybe repository 'io_bazel_rules_closure' was defined later in your WORKSPACE file?
ERROR: cycles detected during target parsing.

Default segmentation quality

After following the instructions, I get the segmentation below. It is not very accurate. Is this normal?
{
"msg": "OK",
"segments": [
"赵雅",
"淇",
"洒泪",
"道",
"歉",
" ",
"和林",
"丹",
"没",
"有",
"任",
"何",
"经济",
"关",
"系"
],
"status": 0
}
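For reference, a reply like the one above can be requested and unpacked with a short client. This is a sketch under assumptions: the endpoint path /tf_seg/api, port 9090 and the {"sentence": ...} request body are taken from the server log quoted in a later issue on this page, treating status 0 as success is inferred from this sample reply, and the Content-Type header and the function names are illustrative choices.

```python
import json
import urllib.request


def build_request(sentence, url="http://127.0.0.1:9090/tf_seg/api"):
    """Build a POST request carrying {"sentence": ...} as a JSON body."""
    body = json.dumps({"sentence": sentence}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


def parse_segments(response_text):
    """Return the segment list, or raise if the service reported a failure."""
    reply = json.loads(response_text)
    if reply.get("status") != 0:
        raise RuntimeError("segmentation failed: %s" % reply.get("msg"))
    return reply["segments"]


def segment(sentence):
    """Send one sentence to a locally running service and return its segments."""
    with urllib.request.urlopen(build_request(sentence)) as resp:
        return parse_segments(resp.read().decode("utf-8"))
```

With the service running locally, `segment("赵雅淇洒泪道歉 和林丹没有任何经济关系")` would return the list shown under "segments".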

error when run bazel build //kcws/cc:seg_backend_api

ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Encountered error while reading extension file 'tensorflow/workspace.bzl': no such package '@org_tensorflow//tensorflow': local_repository rule //external:org_tensorflow must specify an existing directory.
INFO: Elapsed time: 0.049s

build on:
centos6.8 x64
no gpu support
Build label: 0.4.1- (@non-git)
tensorflow-0.11.0

Error when running the command that builds the backend service

Hao:kcws Hao$ bazel build //kcws/cc:seg_backend_api
WARNING: /private/var/tmp/_bazel_Hao/24147b300412080e7db896e2e916763e/external/org_tensorflow/tensorflow/workspace.bzl:13:5: path_prefix was specified to tf_workspace but is no longer used and will be removed in the future.
WARNING: /private/var/tmp/_bazel_Hao/24147b300412080e7db896e2e916763e/external/org_tensorflow/tensorflow/workspace.bzl:15:5: tf_repo_name was specified to tf_workspace but is no longer used and will be removed in the future.
ERROR: /Users/xxx/kcws/third_party/crow/BUILD:5:1: no such package '@boost//': Error downloading from https://sourceforge.net/projects/boost/files/boost/1.61.0/boost_1_61_0.tar.bz2/download to /private/var/tmp/_bazel_Hao/24147b300412080e7db896e2e916763e/external/boost: Error downloading https://sourceforge.net/projects/boost/files/boost/1.61.0/boost_1_61_0.tar.bz2/download to /private/var/tmp/_bazel_Hao/24147b300412080e7db896e2e916763e/external/boost/download.tar.bz2: Failed to connect to https://superb-sea2.dl.sourceforge.net/project/boost/boost/1.61.0/boost_1_61_0.tar.bz2 : sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target and referenced by '//third_party/crow:crow'.
ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.
INFO: Elapsed time: 4.937s

embedding_size AssertionError

At the final training step, that is, when running:
python kcws/train/train_cws_lstm.py --word2vec_path vec.txt --train_data_path <absolute path to train.txt> --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001

I get this error:
Traceback (most recent call last):
File "kcws/train/train_cws_lstm.py", line 262, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "kcws/train/train_cws_lstm.py", line 228, in main
FLAGS.word2vec_path, FLAGS.num_hidden)
File "kcws/train/train_cws_lstm.py", line 62, in __init__
self.c2v = self.load_w2v(c2vPath)
File "kcws/train/train_cws_lstm.py", line 132, in load_w2v
assert (dim == (FLAGS.embedding_size))
AssertionError

I then changed this line in train_cws_lstm.py:
tf.app.flags.DEFINE_integer("embedding_size", 50, "embedding size")

to

tf.app.flags.DEFINE_integer("embedding_size", 200, "embedding size")

and the error went away.
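The assertion in load_w2v compares the vector dimension found in vec.txt with the embedding_size flag, so it fires when the two differ (here, presumably because the vectors were trained with -size 200). A minimal pre-flight check, assuming the text format that word2vec writes with -binary 0, whose first line is "vocab_size dim"; the function names are hypothetical:

```python
def parse_w2v_header(first_line):
    """Return (vocab_size, dim) from the first line of a text-format vector file."""
    vocab_size, dim = first_line.split()
    return int(vocab_size), int(dim)


def check_embedding_size(first_line, embedding_size):
    """Raise if the vector file's dimension differs from the expected size."""
    _, dim = parse_w2v_header(first_line)
    if dim != embedding_size:
        raise ValueError(
            "vec.txt has dimension %d but embedding_size is %d" % (dim, embedding_size)
        )
    return dim


print(parse_w2v_header("5860 50\n"))
```

In practice the header would come from `open("vec.txt").readline()`. Since embedding_size is defined as a command-line flag, passing `--embedding_size 200` when launching training should have the same effect as editing the default in the source.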

Error when running the web service: tfmodel.cc:87 Session was not created with a graph before Run()!

I completed all the earlier steps in the README successfully, but the last step, running the web service, fails with:

I0207 10:51:45.643787 386 tfmodel.cc:64] Reading file to proto: kcws/models/seg_model.pbtxt
I0207 10:51:45.648360 386 tfmodel.cc:69] Creating session.
I0207 10:51:45.675081 386 tfmodel.cc:77] Tensorflow graph loaded from: kcws/models/seg_model.pbtxt
I kcws/cc/tf_seg_model.cc:236] Reading from layer transitions
I kcws/cc/tf_seg_model.cc:242] got num tag:4
I kcws/cc/tf_seg_model.cc:255] Total word :5860
I0207 10:51:45.694744 386 tfmodel.cc:64] Reading file to proto: kcws/models/pos_model.pbtxt
I0207 10:51:45.694774 386 tfmodel.cc:69] Creating session.
I0207 10:51:45.694783 386 tfmodel.cc:77] Tensorflow graph loaded from: kcws/models/pos_model.pbtxt
E0207 10:51:46.211813 386 tfmodel.cc:87] Error during inference: Invalid argument: Session was not created with a graph before Run()!
E kcws/cc/pos_tagger.cc:215] Error during get trans tensors:
F0207 10:51:46.211869 386 seg_backend_api.cc:57] Check failed: tagger->LoadModel(FLAGS_pos_model_path, FLAGS_word_vocab_path, FLAGS_vocab_path, FLAGS_pos_vocab_path, FLAGS_max_word_num) load pos model error
*** Check failure stack trace: ***
Aborted (core dumped)

Typo of train file name in readme

The input file name should match the name of the file produced by the previous step.

--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@
   
   > 使用word2vec 训练 chars_for_w2v (注意-binary 0),得到字嵌入结果vec.txt
   
-  > ./bazel-bin/third_party/word2vec/word2vec -train chars_for_vec.txt -output kcws/models/vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5
+  > ./bazel-bin/third_party/word2vec/word2vec -train chars_for_w2v.txt -output kcws/models/vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5
  
   
   > bazel build kcws/train:generate_training 

ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.

Hi, when I build kcws I run into some errors. How can I fix them?

The errors are as follows:

[root@bio-x-2 cc]# /opt/BioDir/dl/bazel-0.4.3/output/bazel build //kcws/cc:seg_backend_api
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.build/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
WARNING: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/workspace.bzl:13:5: path_prefix was specified to tf_workspace but is no longer used and will be removed in the future.
WARNING: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/workspace.bzl:15:5: tf_repo_name was specified to tf_workspace but is no longer used and will be removed in the future.
ERROR: /root/.cache/bazel/_bazel_root/067d099fd5fd2abf4236febace697e72/external/org_tensorflow/tensorflow/core/platform/default/build_config/BUILD:108:1: error loading package '@jpeg//': Extension file not found. Unable to load package for '//third_party:common.bzl': BUILD file not found on package path and referenced by '@org_tensorflow//tensorflow/core/platform/default/build_config:jpeg'.
ERROR: Analysis of target '//kcws/cc:seg_backend_api' failed; build aborted.
INFO: Elapsed time: 2.612s

=================
I built the Bazel tool as follows:

[root@bio-x-2 bazel-0.4.3]# bash ./compile.sh
INFO: You can skip this first step by providing a path to the bazel binary as second argument:
INFO: ./compile.sh compile /path/to/bazel
 Building Bazel from scratch.......
 Building Bazel with Bazel.
.WARNING: /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions.
INFO: Found 1 target...
INFO: From Compiling third_party/ijar/platform_utils.cc [for host]:
third_party/ijar/platform_utils.cc: In function 'bool devtools_ijar::write_file(const char*, mode_t, const void*, size_t)':
third_party/ijar/platform_utils.cc:67:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (write(fd, data, size) != size) {
^
INFO: From Compiling third_party/ijar/platform_utils.cc:
third_party/ijar/platform_utils.cc: In function 'bool devtools_ijar::write_file(const char*, mode_t, const void*, size_t)':
third_party/ijar/platform_utils.cc:67:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (write(fd, data, size) != size) {
^
INFO: From Compiling third_party/ijar/ijar.cc:
third_party/ijar/ijar.cc: In member function 'virtual bool devtools_ijar::JarStripperProcessor::Accept(const char*, devtools_ijar::u4)':
third_party/ijar/ijar.cc:66:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (filename_len >= CLASS_EXTENSION_LENGTH) {
^
INFO: From Compiling third_party/ijar/ijar.cc [for host]:
third_party/ijar/ijar.cc: In member function 'virtual bool devtools_ijar::JarStripperProcessor::Accept(const char*, devtools_ijar::u4)':
third_party/ijar/ijar.cc:66:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (filename_len >= CLASS_EXTENSION_LENGTH) {
^
INFO: From Compiling src/main/cpp/blaze_util_posix.cc:
src/main/cpp/blaze_util_posix.cc: In function 'void blaze::Daemonize(const string&)':
src/main/cpp/blaze_util_posix.cc:190:28: warning: ignoring return value of 'int dup(int)', declared with attribute warn_unused_result [-Wunused-result]
(void) dup(STDOUT_FILENO); // stderr (2>&1)
^
src/main/cpp/blaze_util_posix.cc: In function 'uint64_t blaze::AcquireLock(const string&, bool, bool, blaze::BlazeLock*)':
src/main/cpp/blaze_util_posix.cc:578:30: warning: ignoring return value of 'int ftruncate(int, __off_t)', declared with attribute warn_unused_result [-Wunused-result]
(void) ftruncate(lockfd, 0);
^
src/main/cpp/blaze_util_posix.cc:583:47: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result [-Wunused-result]
(void) write(lockfd, msg.data(), msg.size());
^
INFO: From JavacBootstrap src/java_tools/buildjar/java/com/google/devtools/build/buildjar/libbootstrap_JarOwner.jar [for host]:
warning: Implicitly compiled files were not subject to annotation processing.
Use -proc:none to disable annotation processing or -implicit to specify a policy for implicit compilation.
1 warning
INFO: From Building src/main/protobuf/libextra_actions_base_java_proto.jar (1 source jar):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/java_tools/junitrunner/java/com/google/testing/coverage/JacocoCoverage.jar (9 source files):
Note: src/java_tools/junitrunner/java/com/google/testing/coverage/MethodProbesMapper.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/tools/android/java/com/google/devtools/build/android/ziputils/libziputils_lib.jar (12 source files):
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libconcurrent.jar (18 source files):
Note: src/main/java/com/google/devtools/build/lib/concurrent/AbstractQueueVisitor.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building third_party/java/apkbuilder/apkbuilder.jar (15 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libutil.jar (45 source files):
Note: src/main/java/com/google/devtools/build/lib/util/OrderedSetMultimap.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/cmdline/libcmdline.jar (10 source files):
Note: src/main/java/com/google/devtools/build/lib/cmdline/RepositoryName.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/skyframe/libskyframe.jar (67 source files):
Note: src/main/java/com/google/devtools/build/skyframe/ReverseDepsUtilImpl.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libsyntax.jar (86 source files):
Note: src/main/java/com/google/devtools/build/lib/syntax/BuiltinFunction.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libpackages-internal.jar (98 source files):
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/actions/libactions.jar (91 source files):
Note: src/main/java/com/google/devtools/build/lib/actions/Actions.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libbuild-base.jar (381 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libproto-rules.jar (13 source files):
Note: src/main/java/com/google/devtools/build/lib/rules/proto/ProtoCommon.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/query2/libquery2.jar (12 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/query2/libquery-output.jar (10 source files):
Note: src/main/java/com/google/devtools/build/lib/query2/output/QueryOutputUtils.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/rules/genquery/libgenquery.jar (2 source files):
Note: src/main/java/com/google/devtools/build/lib/rules/genquery/GenQuery.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/rules/cpp/libcpp.jar (80 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libpython-rules.jar (15 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libjava-compilation.jar (37 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: src/main/java/com/google/devtools/build/lib/rules/java/JavaCompileAction.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libjava-rules.jar (32 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libandroid-rules.jar (59 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libideinfo.jar (4 source files):
Note: src/main/java/com/google/devtools/build/lib/ideinfo/AndroidStudioInfoAspect.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/rules/objc/libobjc.jar (114 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: src/main/java/com/google/devtools/build/lib/rules/objc/IterableWrapper.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libruntime.jar (94 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/sandbox/libsandbox.jar (16 source files):
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/worker/libworker.jar (11 source files):
Note: src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnStrategy.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
INFO: From Building src/main/java/com/google/devtools/build/lib/libbazel-rules.jar (87 source files, 14 resources):
Note: src/main/java/com/google/devtools/build/lib/bazel/rules/java/BazelJavaSemantics.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Target //src:bazel up-to-date:
bazel-bin/src/bazel
INFO: Elapsed time: 178.725s, Critical Path: 170.17s
WARNING: /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE:1: Workspace name in /tmp/bazel_lAI1U4my/out/external/bazel_tools/WORKSPACE (@io_bazel) does not match the name given in the repository's definition (@bazel_tools); this will cause a build error in future versions.

Build successful! Binary is here: /opt/BioDir/dl/bazel-0.4.3/output/bazel

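My own trained model segments differently from the official one

I trained on the corpus downloaded via the 待字闺中 official account.

For the demo sentence “赵雅淇洒泪道歉 和林丹没有任何经济关系”, the segmentation result is: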
{
"msg": "OK",
"segments": [
"赵雅",
"淇",
"洒泪",
"道",
"歉",
" ",
"和林",
"丹",
"没",
"有",
"任",
"何",
"经济",
"关",
"系"
],
"status": 0
}

Why is that?

The commands I used for training are:
cd kcws

python kcws/train/process_anno_file.py /usr/local/people2014 pre_chars_for_w2v.txt

bazel build third_party/word2vec:word2vec

./bazel-bin/third_party/word2vec/word2vec -train pre_chars_for_w2v.txt -save-vocab pre_vocab.txt -min-count 3

python kcws/train/replace_unk.py pre_vocab.txt pre_chars_for_w2v.txt chars_for_w2v.txt

./bazel-bin/third_party/word2vec/word2vec -train chars_for_w2v.txt -output kcws/models/vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5

bazel build kcws/train:generate_training

./bazel-bin/kcws/train/generate_training kcws/models/vec.txt /usr/local/people2014 all.txt

python kcws/train/filter_sentence.py all.txt

python kcws/train/train_cws_lstm.py --word2vec_path kcws/models/vec.txt --train_data_path /usr/local/kcws/train.txt --test_data_path /usr/local/kcws/test.txt --max_sentence_len 80 --learning_rate 0.001

bazel build kcws/cc:dump_vocab
./bazel-bin/kcws/cc/dump_vocab kcws/models/vec.txt vocab.txt

./bazel-bin/kcws/cc/seg_backend_api --model_path=/usr/local/kcws/kcws/models/seg_model.pbtxt --vocab_path=/usr/local/kcws/vocab.txt --max_sentence_len=80

After training, the reported accuracy is 96.61%.

Web service fails to run

Training steps 1-5 all went fine, but step 6, running the web service, fails with the following output:
open@ubuntu:~/Documents/Project/kcws$ ./bazel-bin/kcws/cc/seg_backend_api --model_path=~/Documents/Project/kcws/kcws/models/seg_model.pbtxt --vocab_path=~/Documents/Project/kcws/vocab.txt --max_sentence_len=80
I kcws/cc/tf_seg_model.cc:146] Loading Tensorflow.
I kcws/cc/tf_seg_model.cc:148] Making new SessionOptions.
I kcws/cc/tf_seg_model.cc:151] Got config, 0 devices
I kcws/cc/tf_seg_model.cc:154] Session created.
I kcws/cc/tf_seg_model.cc:156] Graph created.
I kcws/cc/tf_seg_model.cc:157] Reading file to proto: ~/Documents/Project/kcws/kcws/models/seg_model.pbtxt
I kcws/cc/tf_seg_model.cc:162] Creating session.
I kcws/cc/tf_seg_model.cc:177] End computing.
E kcws/cc/tf_seg_model.cc:180] Error during get trans tensors: Invalid argument: Session was not created with a graph before Run()!
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1207 08:30:27.543987 10156 seg_backend_api.cc:41] Check failed: model.LoadModel(FLAGS_model_path, FLAGS_vocab_path, FLAGS_max_sentence_len) Load model error
*** Check failure stack trace: ***
Aborted (core dumped)

What is causing this, and how can I fix it?
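One thing worth checking, offered as a possibility rather than a confirmed diagnosis: the log line "Reading file to proto: ~/Documents/..." shows that the binary received a literal `~`. Shells generally do not expand `~` in the middle of an argument such as `--model_path=~/...`, and a program that opens the path as given will not find the file. A small sketch that expands and verifies a path before it is passed on; the helper name is hypothetical:

```python
import os


def resolve_model_path(path):
    """Expand '~' and return an absolute path, failing early if the file is missing."""
    full = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(full):
        raise FileNotFoundError(full)
    return full
```

On the command line, writing the path with `$HOME` or `$(pwd)` in place of `~` is one way to make sure the binary receives the full path.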

./bazel-bin/kcws/train/generate_training UTF-8 Error

'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data
'utf8' codec can't decode bytes in position 0-1: unexpected end of data^C
Traceback (most recent call last):
File "/home/ml/py27env/kcws/bazel-bin/kcws/train/generate_training.runfiles/__main__/kcws/train/generate_training.py", line 155, in <module>
main(len(sys.argv), sys.argv)
File "/home/ml/py27env/kcws/bazel-bin/kcws/train/generate_training.runfiles/__main__/kcws/train/generate_training.py", line 147, in main
processLine(line, out, vob)
File "/home/ml/py27env/kcws/bazel-bin/kcws/train/generate_training.runfiles/__main__/kcws/train/generate_training.py", line 122, in processLine
print e
KeyboardInterrupt
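Messages like these may mean that some corpus files are not valid UTF-8, or that a multi-byte character is being cut somewhere in processing. The first possibility can be checked independently of kcws. The sketch below scans raw byte lines and reports the ones that fail to decode; the function name is hypothetical:

```python
def find_invalid_utf8(raw_lines):
    """Yield (line_number, error) for byte lines that are not valid UTF-8."""
    for number, raw in enumerate(raw_lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            yield number, str(err)


# Second sample line ends in the middle of a three-byte character.
sample = [b"\xe4\xb8\xad\xe6\x96\x87\n", b"\xe4\xb8\n"]
for number, err in find_invalid_utf8(sample):
    print(number, err)
```

To scan a real file, open it in binary mode (`open(path, "rb")`) and pass the file object as `raw_lines`. If every corpus file passes, the problem is more likely in how the lines are sliced than in the corpus encoding.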

word2vec output format

Hi, I see there is no script for the word2vec step. If I compute the vectors myself, does the output vec.txt contain one vector per word?
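With -binary 0, the word2vec tool writes plain text: a header line with the vocabulary size and the vector dimension, then one line per token consisting of the token followed by its vector components. In this pipeline the tokens are presumably single characters, since the input is chars_for_w2v.txt. A minimal reader for that layout, with a hypothetical function name:

```python
def parse_w2v_text(lines):
    """Parse word2vec text output into (dim, {token: [floats]})."""
    it = iter(lines)
    _vocab_size, dim = (int(x) for x in next(it).split())
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


sample = ["2 3", "</s> 0.1 0.2 0.3 ", "我 0.4 0.5 0.6 "]
dim, vectors = parse_w2v_text(sample)
print(dim, sorted(vectors))
```

A vec.txt produced by other means should follow the same layout if it is to be consumed by the later steps, which read the file as word2vec text output.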

gflags link failed

Linking against the third-party gflags failed.

Fixed by using a self-compiled gflags; this may be a gflags version issue.
The BUILD file was modified as follows:


--- a/third_party/glog/BUILD
+++ b/third_party/glog/BUILD
@@ -45,10 +45,7 @@ cc_library(
         "include/glog/stl_logging.h",
         "include/glog/vlog_is_on.h",
     ],
-    deps = [
-      "//third_party/gflags:gflags-cxx",
-
-    ],
+    linkopts = ["-lgflags"],
     hdrs = [
         "include/glog/logging.h",
     ],

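Questions about the tagging part

Hello, yesterday I went through the POS tagging module you recently added and found that a few steps seem to have problems. I tried changing them myself and it now runs end to end with 99.57% accuracy. Could you take a look? The problems are:
1. In step 5, the argument “lines_withpos.txt” is passed in, but the code never writes anything to it. I think the code should write each tag together with its index.
2. In step 6, the third argument should be the dictionary “lines_withpos.txt” generated in the previous step, not “pos_vocab.txt”.

Does this look correct to you?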

Error in kcws/train/train_cws_lstm.py

python 2.7.12
tensorflow 1.0.1

All the earlier steps completed, but step 4, kcws/train/train_cws_lstm.py, failed.

The error is:
File "kcws/train/train_cws_lstm.py", line 276, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "kcws/train/train_cws_lstm.py", line 255, in main
with sv.managed_session(master='') as sess:
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 960, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 788, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 386, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 949, in managed_session
start_standard_services=start_standard_services)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 706, in prepare_or_wait_for_session
init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 256, in prepare_session
config=config)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 188, in _restore_checkpoint
saver.restore(sess, ckpt.model_checkpoint_path)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1428, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 965, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1015, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1035, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [5692,50] rhs shape= [9835,50]
[[Node: save/Assign_25 = Assign[T=DT_FLOAT, _class=["loc:@words"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](words/Adam_1, save/RestoreV2_25)]]
I did not modify the code and ran each step from the command line as instructed. What could be the problem?
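The traceback ends in a checkpoint restore: the variable in the current graph has shape [5692,50], while the value being restored has shape [9835,50]. One possible cause, stated as a guess, is a checkpoint left in the log directory by an earlier run that used a different vec.txt. The sketch below moves an existing log directory aside so that training starts without old checkpoints. The directory name `logs` is an assumption based on the paths in the README's freeze_graph command, and the helper name is hypothetical.

```python
import os
import shutil
import time


def archive_log_dir(log_dir="logs"):
    """Rename an existing log directory; return the backup path or None."""
    if not os.path.isdir(log_dir):
        return None
    backup = "%s.bak-%d" % (log_dir.rstrip("/"), int(time.time()))
    shutil.move(log_dir, backup)
    return backup
```

Renaming rather than deleting keeps the earlier checkpoints available in case they are still needed.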

pos_lines.txt

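Hello, regarding the POS tagging module you added yesterday: the project does not seem to include the file “pos_lines.txt”. Should “pos_lines.txt” actually point to "words_train.txt"?
Also, the demo is currently unreachable. -_-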

About the accuracy test

Hello, I ran the tagging part and ended up with 99.57% accuracy. Is that right? It seems too high... I would really appreciate a reply.

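About company-name recognition

Hello, could you share the approach behind the company-name recognition, and where the corpus comes from? Thanks.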

bazel build third_party/word2vec:word2vec Error

WARNING: /home/ml/.cache/bazel/_bazel_ml/7e44fa86fc02571bc1e4272504e8778f/external/org_tensorflow/tensorflow/workspace.bzl:13:5: path_prefix was specified to tf_workspace but is no longer used and will be removed in the future.
WARNING: /home/ml/.cache/bazel/_bazel_ml/7e44fa86fc02571bc1e4272504e8778f/external/org_tensorflow/tensorflow/workspace.bzl:15:5: tf_repo_name was specified to tf_workspace but is no longer used and will be removed in the future.
INFO: Found 1 target...
ERROR: /home/ml/py27env/kcws/third_party/word2vec/BUILD:5:1: Linking of rule '//third_party/word2vec:word2vec' failed: gcc failed: error executing command /usr/bin/gcc -o bazel-out/local-fastbuild/bin/third_party/word2vec/word2vec -Wl,-no-as-needed -B/usr/bin -B/usr/bin -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,-S ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/local-fastbuild/bin/third_party/word2vec/_objs/word2vec/third_party/word2vec/word2vec.pic.o: In function `TrainModel':
word2vec.c:(.text+0x35bd): undefined reference to `pthread_create'
word2vec.c:(.text+0x3600): undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
Target //third_party/word2vec:word2vec failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.325s, Critical Path: 0.22s

Error during inference: Not found: FetchOutputs node Reshape_7: not found

I am using TensorFlow 0.12.0. Following the README steps, I generated seg_model.pbtxt at the end:

python $TF_HOME/tensorflow/python/tools/freeze_graph.py --input_graph logs/graph.pbtxt --input_checkpoint logs/model.ckpt --output_node_names "transitions,BatchMatMul_1" --output_graph seg_model.pbtxt

At runtime it reports:

./bazel-bin/kcws/cc/seg_backend_api --model_path=$(pwd)/seg_model.pbtxt  --vocab_path=$(pwd)/vocab.txt  --max_sentence_len=50 
I kcws/cc/tf_seg_model.cc:246] Loading Tensorflow.
I kcws/cc/tf_seg_model.cc:247] Making new SessionOptions.
I kcws/cc/tf_seg_model.cc:250] Got config, 0 devices
I kcws/cc/tf_seg_model.cc:253] Session created.
I kcws/cc/tf_seg_model.cc:255] Graph created.
I kcws/cc/tf_seg_model.cc:256] Reading file to proto: /Users/darren/Sources/tf/kcws/seg_model.pbtxt
I kcws/cc/tf_seg_model.cc:261] Creating session.
I kcws/cc/tf_seg_model.cc:276] End computing.
I kcws/cc/tf_seg_model.cc:282] Reading from layer transitions
I kcws/cc/tf_seg_model.cc:288] got num tag:4
I kcws/cc/tf_seg_model.cc:299] Tensorflow graph loaded from: /Users/darren/Sources/tf/kcws/vocab.txt
I kcws/cc/tf_seg_model.cc:304] Total word :5691
(2016-12-27 09:27:20) [INFO    ] Crow/0.1 server is running, local port 9090
(2016-12-27 09:27:24) [INFO    ] Request: 127.0.0.1:50005 0x7f973c81de00 HTTP/1.1 GET /
(2016-12-27 09:27:24) [INFO    ] Response: 0x7f973c81de00 / 200 0
(2016-12-27 09:27:26) [INFO    ] Request: 127.0.0.1:50005 0x7f973c81de00 HTTP/1.1 POST /tf_seg/api
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1227 17:27:26.263433 209637376 seg_backend_api.cc:49] got body:
{
	"sentence" : "赵雅淇洒泪道歉 和林丹没有任何经济关系"
}


E kcws/cc/tf_seg_model.cc:435] Error during inference: Not found: FetchOutputs node Reshape_7: not found
(2016-12-27 09:27:26) [INFO    ] Response: 0x7f973c81de00 /tf_seg/api 200 0
^C(2016-12-27 09:29:23) [INFO    ] Exiting.

The seg_model.pbtxt I generated indeed contains no Reshape_7 node. What could be the cause?

$ strings seg_model.pbtxt |grep Reshape
Reshape_6/shape
Reshape_6
Reshape
Reshape_6/shape*(
Reshape_6
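One way to see which node names exist before choosing --output_node_names is to scan the text-format logs/graph.pbtxt directly. The sketch below is a hypothetical helper (list_node_names is not part of kcws or TensorFlow). It assumes the file is a text-format GraphDef in which every node block starts with its name field, as files written by tf.train.write_graph normally do. It only reads text, so TensorFlow does not need to be installed.

```python
import re


def list_node_names(pbtxt_text, prefix=""):
    """Return names of `node { ... }` entries in a text-format GraphDef.

    Assumption: each node block begins with `node {` followed by `name: "..."`.
    """
    pattern = r'^node\s*\{\s*name:\s*"([^"]+)"'
    names = []
    for match in re.finditer(pattern, pbtxt_text, re.MULTILINE):
        name = match.group(1)
        if name.startswith(prefix):
            names.append(name)
    return names


# Tiny hand-written sample standing in for logs/graph.pbtxt.
sample = '''node {
  name: "transitions"
  op: "VariableV2"
}
node {
  name: "Reshape_6"
  op: "Reshape"
  input: "add"
  input: "Reshape_6/shape"
}
'''

print(list_node_names(sample, prefix="Reshape"))
# For a real file: list_node_names(open("logs/graph.pbtxt").read(), "Reshape")
```

If such a scan shows Reshape_6 but no Reshape_7, the graph built under this TensorFlow version numbers its ops differently from what the server fetches, which would be consistent with the error above.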

Question

When I run python kcws/train/train_cws_lstm.py --word2vec_path vec.txt --train_data_path ~/kcws/train.txt --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001
the following error occurs:

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
load data from: vec.txt
train data path: /home/joyivan/kcws/train.txt
Traceback (most recent call last):
File "kcws/train/train_cws_lstm.py", line 286, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "kcws/train/train_cws_lstm.py", line 256, in main
total_loss = model.loss(X, Y)
File "kcws/train/train_cws_lstm.py", line 137, in loss
P, sequence_length = self.inference(X)
File "kcws/train/train_cws_lstm.py", line 104, in inference
reuse=reuse),
TypeError: __init__() got an unexpected keyword argument 'reuse'

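kcws/train/train_cws_lstm.py, line 256

An error is raised here: ValueError: Restore called with invalid save path: u'logs/model.ckpt'. File path is: u'logs/model.ckpt'. It seems to happen when the Supervisor that manages distributed training of the model is created and the trained model is saved. How can this be solved? Thanks.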

Problems with tf.contrib.rnn.LSTMCell and tf.concat under TF 0.12

My environment is TF 0.12. In /kcws/train/train_cws_lstm.py:

tf.contrib.rnn.LSTMCell(self.numHidden) raises an error and has to be changed to tf.nn.rnn_cell.LSTMCell(self.numHidden)

output = tf.concat([forward_output, backward_output], 2)
raises an error and has to be changed to
output = tf.concat(2, [forward_output, backward_output])

This is probably a TensorFlow version issue.
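The two substitutions above can be wrapped in small version-aware helpers so the same training script runs on both releases. This is only a sketch: concat_compat and lstm_cell_class are hypothetical names, not part of kcws. It relies on tf.concat taking (values, axis) from TensorFlow 1.0 onward and (concat_dim, values) before that. The tf module is passed in as an argument so the helpers can be read and tested without TensorFlow installed.

```python
def concat_compat(tf, values, axis):
    """Call tf.concat with the argument order of the installed version."""
    major = int(tf.__version__.split(".")[0])
    if major >= 1:
        # TF 1.0 and later: tf.concat(values, axis)
        return tf.concat(values, axis)
    # TF 0.x: tf.concat(concat_dim, values)
    return tf.concat(axis, values)


def lstm_cell_class(tf):
    """Return LSTMCell from tf.contrib.rnn if present, else tf.nn.rnn_cell."""
    contrib = getattr(tf, "contrib", None)
    rnn = getattr(contrib, "rnn", None)
    cell = getattr(rnn, "LSTMCell", None)
    if cell is not None:
        return cell
    return tf.nn.rnn_cell.LSTMCell


# Intended use inside the training script (requires TensorFlow):
#   import tensorflow as tf
#   cell = lstm_cell_class(tf)(num_hidden)
#   output = concat_compat(tf, [forward_output, backward_output], 2)
```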

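A beginner's question

Hello, how should I go about learning this field? I currently work on Android and am interested in deep learning. Could you advise me on how to get started?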

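A minor segmentation error

Hello, I would like to report a minor error.
For the sentence 挑战**创辉煌, the segmentation should be:
挑战/中/共创/辉煌
but the actual result is:
挑战/**/创辉煌
Could this be caused by high-frequency words in the People's Daily corpus?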

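Segmentation crashes on the specific input "蓝瘦2香菇" after loading a user dictionary

Following the instructions, I prepared a user dictionary file containing a single entry, "蓝瘦香菇 4".
I then loaded the user dictionary with:
./bazel-bin/kcws/cc/seg_backend_api --model_path=kcws/models/seg_model.pbtxt --vocab_path=kcws/models/basic_vocab.txt --user_dict_path=user_dict.txt --max_sentence_len=80
With the dictionary loaded, "蓝瘦香菇" is segmented correctly, but for the input "蓝瘦2香菇" the segmentation service crashes and exits with "Segmentation fault (core dumped)".
Without the user dictionary, the same input works normally.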
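To reproduce the crash from a script, the request can be built as below. This is a sketch under assumptions. The /tf_seg/api path, port 9090 and the {"sentence": ...} JSON body are taken from the server log in an earlier report on this page. The Content-Type header and the helper name build_seg_request are hypothetical. The block only constructs the request; actually sending it needs a running seg_backend_api.

```python
import json
import urllib.request


def build_seg_request(sentence, host="127.0.0.1", port=9090):
    """Build (but do not send) a POST request for the segmentation API."""
    body = json.dumps({"sentence": sentence}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url="http://%s:%d/tf_seg/api" % (host, port),
        data=body,
        # Assumption: the server accepts a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_seg_request("蓝瘦2香菇")
print(req.get_method(), req.full_url)

# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```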

Error when building the backend service: tensorflow.bzl: 'check_version' is not defined

Running bazel build //kcws/cc:seg_backend_api fails with:

ERROR: /home/guan/Software/kcws/WORKSPACE:23:6: file '@org_tensorflow//tensorflow:tensorflow.bzl' does not contain symbol 'check_version'.
ERROR: /home/guan/Software/kcws/WORKSPACE:24:1: name 'check_version' is not defined.
ERROR: Error evaluating WORKSPACE file.
ERROR: error loading package 'external': Package 'external' contains errors.

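Error running python kcws/train/train_cws_lstm.py

python 2.7.5
tensorflow 1.0.1
All the earlier steps completed; step 4, kcws/train/train_cws_lstm.py, fails
(somewhat similar to the problem another user reported earlier).

The error is:
File "kcws/train/train_cws_lstm.py", line 276, in <module>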
tf.app.run()
File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "kcws/train/train_cws_lstm.py", line 253, in main
test_unary_score, test_sequence_length = model.test_unary_score()
File "kcws/train/train_cws_lstm.py", line 175, in test_unary_score
P, sequence_length = self.inference(self.inp, reuse=True, trainMode=False)
File "kcws/train/train_cws_lstm.py", line 105, in inference
scope="RNN_forward")
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 552, in dynamic_rnn
dtype=dtype)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 719, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2623, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2456, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2406, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 702, in _time_step
skip_conditionals=True)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 177, in _rnn_step
new_output, new_state = call_cell()
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 690, in <lambda>
call_cell = lambda: cell(input_t, state)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 405, in __call__
reuse=self._reuse) as unit_scope:
File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 100, in _checked_scope
"the argument reuse=True." % (scope_name, type(cell).__name__))
ValueError: Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnn_fwbw/RNN_forward/lstm_cell'; and the cell was not constructed as LSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.

I followed the steps exactly and did not modify any code. What could be the cause? Thanks.
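The ValueError above asks for the cell to be constructed with reuse=True, while an earlier report on this page shows a TypeError because the installed LSTMCell did not accept a reuse argument at all. A possible way to cover both situations is to pass reuse only when the constructor declares it. The sketch below is an assumption-laden illustration: make_cell is a hypothetical helper, not code from kcws, and it assumes the cell class exposes reuse as a named constructor parameter, as the error message's LSTMCell(..., reuse=True) suggests for TF 1.0.1.

```python
import inspect


def make_cell(cell_cls, num_hidden, reuse=None):
    """Construct an RNN cell, passing `reuse` only if the class accepts it."""
    params = inspect.signature(cell_cls).parameters
    if "reuse" in params:
        return cell_cls(num_hidden, reuse=reuse)
    # Older cell classes: fall back to the plain constructor.
    return cell_cls(num_hidden)


# Intended use inside inference() (requires TensorFlow):
#   cell = make_cell(tf.contrib.rnn.LSTMCell, self.numHidden, reuse=reuse)
```

Whether this is enough depends on which revision of train_cws_lstm.py is paired with which TensorFlow release. Matching the script to the version named in the README is the simpler route.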

process_anno_file issues

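I rewrote the author's corpus-processing script, and the data it generates differs from the original output. Part of the diff is shown below. Should the square brackets be removed as well?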


➜  kcws git:(master) ✗ head diff.log -n 25
412c412
< ▲ 腾 讯 [ 手 机 / n   管 家 / n n t ]
---
> ▲ 腾 讯 手 机 管 家
428c428
< ( 二 ) 依 托 现 代 农 业 园 区 建 设 , 增 强 农 业 [ 综 合 / v n   生 产 能 力 / n ]
---
> ( 二 ) 依 托 现 代 农 业 园 区 建 设 , 增 强 农 业 综 合 生 产 能 力
459c459
< ( 六 ) 围 绕 美 丽 陕 西 的 目 标 , 全 力 推 进 [ 生 态 / n   环 境 / n   建 设 / v n ]
---
> ( 六 ) 围 绕 美 丽 陕 西 的 目 标 , 全 力 推 进 生 态 环 境 建 设
479c479
< 2 0 1 3 年 G D P 预 计 实 现 增 长 [ 1 1 / m   . / w   1 / m   % / w   左 右 / f ]
---
> 2 0 1 3 年 G D P 预 计 实 现 增 长 1 1 . 1 % 左 右
577c577
< 东 园 门 楼 : 红 砖 砌 筑 [ 中 西 / n   合 璧 / v i ]
---
> 东 园 门 楼 : 红 砖 砌 筑 中 西 合 璧
777c777
< 铁 路 客 流 最 高 峰 将 超 每 日 [ 1 0 / m   万 / d   人 / n ]
---
> 铁 路 客 流 最 高 峰 将 超 每 日 1 0 万 人

PS: I have uploaded my script to my fork:

https://github.com/Vimos/kcws/blob/master/kcws/train/process_anno_file_vimos.py
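For illustration, the behaviour on the ">" side of the diff (brackets and tags removed, characters separated by spaces) can be sketched as follows. This is not the linked script or the author's process_anno_file.py. annotated_to_chars is a hypothetical helper, and the input format (word/tag tokens, with compound phrases wrapped as [ ... ]tag) is an assumption based on the diff above.

```python
import re


def annotated_to_chars(line):
    """Strip compound brackets and POS tags, return space-separated characters."""
    # Drop a closing bracket together with the compound tag after it, e.g. "]nt".
    line = re.sub(r"\][A-Za-z]*", "", line)
    # Drop opening brackets.
    line = line.replace("[", "")
    chars = []
    for token in line.split():
        # Keep the part before the last "/", i.e. the word without its tag.
        word = token.rsplit("/", 1)[0]
        chars.extend(word)
    return " ".join(chars)


# The tag on the first token is made up for the example.
print(annotated_to_chars("腾讯/nz [手机/n 管家/n]nt"))
```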

How is the pbtxt file used by the API generated?

I hope the author can mention this in the README.
I tried running this command:
python tensorflow/tensorflow/python/tools/freeze_graph.py --input_graph kcws/logs/graph.pbtxt --input_checkpoint kcws/logs/checkpoint --output_node_names "transitions,BatchMatMul_1" --output_graph kcws/kcws/models/final.pbtxt
but it fails, and I am not sure whether one of my arguments is wrong. The error is:

W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open kcws/logs/checkpoint: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W tensorflow/core/framework/op_kernel.cc:975] Data loss: Unable to open table file kcws/logs/checkpoint: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
  File "tensorflow/tensorflow/python/tools/freeze_graph.py", line 135, in <module>
    tf.app.run()
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "tensorflow/tensorflow/python/tools/freeze_graph.py", line 132, in main
    FLAGS.output_graph, FLAGS.clear_devices, FLAGS.initializer_nodes)
  File "tensorflow/tensorflow/python/tools/freeze_graph.py", line 117, in freeze_graph
    sess.run([restore_op_name], {filename_tensor_name: input_checkpoint})
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file kcws/logs/checkpoint: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
	 [[Node: save/RestoreV2_23 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_23/tensor_names, save/RestoreV2_23/shape_and_slices)]]

Caused by op u'save/RestoreV2_23', defined at:
  File "tensorflow/tensorflow/python/tools/freeze_graph.py", line 135, in <module>
    tf.app.run()
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "tensorflow/tensorflow/python/tools/freeze_graph.py", line 132, in main
    FLAGS.output_graph, FLAGS.clear_devices, FLAGS.initializer_nodes)
  File "tensorflow/tensorflow/python/tools/freeze_graph.py", line 104, in freeze_graph
    _ = tf.import_graph_def(input_graph_def, name="")
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 285, in import_graph_def
    op_def=op_def)
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2371, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/zhongya/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1258, in __init__
    self._traceback = _extract_stack()

DataLossError (see above for traceback): Unable to open table file kcws/logs/checkpoint: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
	 [[Node: save/RestoreV2_23 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_23/tensor_names, save/RestoreV2_23/shape_and_slices)]]
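The path given to --input_checkpoint in the command above, kcws/logs/checkpoint, is the small text file in which TensorFlow records the name of the latest checkpoint, not the checkpoint data itself. That is a plausible reason for the "not an sstable" message, and the README's export step passes logs/model.ckpt instead. The sketch below reads that state file to find the prefix to pass. latest_checkpoint_prefix is a hypothetical helper, and it assumes the file has the usual model_checkpoint_path: "..." line.

```python
import os
import re


def latest_checkpoint_prefix(log_dir):
    """Return the checkpoint prefix recorded in `<log_dir>/checkpoint`, or None."""
    state_file = os.path.join(log_dir, "checkpoint")
    with open(state_file) as f:
        for line in f:
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"', line)
            if match:
                path = match.group(1)
                # Relative entries are taken relative to the log directory.
                if os.path.isabs(path):
                    return path
                return os.path.join(log_dir, path)
    return None


# Example: print(latest_checkpoint_prefix("kcws/logs"))
```

With TensorFlow installed, tf.train.latest_checkpoint(log_dir) serves the same purpose.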

char_vec.txt

Hi, one more question: is the "char_vec.txt" from the earlier segmentation system that you mention in "pos_train.md" the same file as our earlier "vec.txt"?

Error when running on the latest TensorFlow 1.0

File "kcws/train/train_cws_lstm.py", line 230, in inputs
features = tf.transpose(tf.stack(whole[0:FLAGS.max_sentence_len]))
AttributeError: 'module' object has no attribute 'stack'
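tf.stack was introduced in TensorFlow 0.12 as the new name for tf.pack, so this AttributeError suggests that the interpreter is importing an older TensorFlow than intended. A quick check against the README's stated minimum (1.0.0 alpha) might look like the sketch below. parse_version and meets_minimum are hypothetical helpers, and pre-release suffixes such as "-alpha" are deliberately ignored.

```python
import re


def parse_version(version_string):
    """Return the first three numeric components, e.g. '1.0.0-alpha' -> (1, 0, 0)."""
    numbers = re.findall(r"\d+", version_string)[:3]
    numbers += ["0"] * (3 - len(numbers))
    return tuple(int(n) for n in numbers)


def meets_minimum(version_string, minimum=(1, 0, 0)):
    """True if the version is at least `minimum` (pre-release tags ignored)."""
    return parse_version(version_string) >= minimum


# With TensorFlow installed:
#   import tensorflow as tf
#   print(tf.__version__, tf.__file__, meets_minimum(tf.__version__))
```

Printing tf.__file__ alongside the version shows which installation the interpreter actually picked up.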
