dalab / web2text

Source code for the paper "Web2Text: Deep Structured Boilerplate Removal", full paper @ ECIR'18

License: MIT License

Scala 0.16% HTML 99.37% Python 0.06% Shell 0.01% JavaScript 0.01% Perl 0.27% Makefile 0.01% TeX 0.12% Smarty 0.01% CSS 0.01% Raku 0.01% SCSS 0.01%

web2text's People

Contributors

tvogels

web2text's Issues

Missing python requirement

The future module (https://pypi.org/project/future/) appears to be a missing requirement: either there is no requirements file, or the docs are incomplete.

Command:
$ python src/main/python/main.py classify result/step_1_extracted_features result/step_2_classified_labels

Error:

Traceback (most recent call last):
  File "src/main/python/main.py", line 7, in <module>
    from forward import EDGE_VARIABLES, UNARY_VARIABLES, edge, loss, unary
  File "/home/lawrence/web2text/src/main/python/forward.py", line 4, in <module>
    from config import Config
  File "/home/lawrence/web2text/src/main/python/config.py", line 5, in <module>
    from future.utils import iteritems
ModuleNotFoundError: No module named 'future'

Resolution:
$ pip install future
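
A quick way to spot this kind of gap before running main.py is to check which modules can be found. The sketch below is only an illustration: the module list is an assumption pieced together from the reports on this page, not an official requirements file.

```python
import importlib.util

# Assumed list, based on the modules mentioned in these reports.
REQUIRED_MODULES = ["numpy", "tensorflow", "future"]


def missing_modules(names):
    """Return the module names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules, try: pip install " + " ".join(missing))
    else:
        print("All listed modules are importable.")
```

find_spec only locates a module without importing it, so the check stays fast even for heavy packages.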

Cannot run Recipe

Hello, I've been trying to get the base recipe running for a while without success. The following happens when I enter the run command:

[info] Compiling 37 Scala sources to /home/jrojas/Projects/Extractors/web2text/target/scala-2.10/classes ...
[error] /home/jrojas/Projects/Extractors/web2text/src/main/scala/ch/ethz/dalab/web2text/cdom/Node.scala:39:33: missing parameter type
[error] val c = for (c <- children; l <- c.toString.lines) yield {" " + l}
[error] ^
[error] /home/jrojas/Projects/Extractors/web2text/src/main/scala/ch/ethz/dalab/web2text/cleaneval/CleanEval.scala:140:54: value drop is not a member of java.util.stream.Stream[String]
[error] val contents = if (f.startsWith("URL:")) f.lines.drop(1).mkString("\n")
[error] ^
[error] /home/jrojas/Projects/Extractors/web2text/src/main/scala/ch/ethz/dalab/web2text/features/PageFeatures.scala:29:63: type mismatch;
[error] found : java.util.stream.Stream[String]
[error] required: Iterator[?]
[error] (blockFeatureLabels.toIterator zip blockFeatures.toString.lines)
[error] ^
[error] three errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 5 s, completed Nov 12, 2019, 12:42:05 AM

System information:
OS: Ubuntu 18.04
SBT Version: 1.3.3
Scala Version: 2.10.4

Also tested with:
SBT Version: 0.13.7

Not compatible with TF 2?

$ python src/main/python/main.py classify result/step_1_extracted_features result/step_2_classified_labels
2021-10-26 21:30:40.002411: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "src/main/python/main.py", line 7, in <module>
    from forward import EDGE_VARIABLES, UNARY_VARIABLES, edge, loss, unary
  File "/home/ubuntu/sandbox/alan134/web2text/src/main/python/forward.py", line 2, in <module>
    from tensorflow import variable_scope, convert_to_tensor
ImportError: cannot import name 'variable_scope' from 'tensorflow' (/home/ubuntu/anaconda3/envs/alan-env/lib/python3.8/site-packages/tensorflow/__init__.py)

Looking up "variable_scope", I found https://www.tensorflow.org/api_docs/python/tf/compat/v1/variable_scope.

The page indicates that this function (and apparently a number of others) is deprecated and needs to be replaced with something like tf.compat.v1.variable_scope.

I would do this myself but I am not well versed in TF. Any chance of getting this updated to work with TF2?

Thanks,
Alan
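
One way to make such an import tolerant of both TensorFlow 1 and TensorFlow 2 is to look the name up in the compatibility module first and fall back to the top-level package. The helper below is a generic, hypothetical sketch (resolve_symbol is not part of web2text or TensorFlow), and it has not been verified against the rest of the web2text code, which may need further changes.

```python
import importlib


def resolve_symbol(name, module_names):
    """Return attribute `name` from the first importable module that defines it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed, or absent in this version
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError("%s not found in any of: %s" % (name, ", ".join(module_names)))


# Intended use in forward.py (requires TensorFlow to be installed):
# variable_scope = resolve_symbol("variable_scope",
#                                 ["tensorflow.compat.v1", "tensorflow"])
```

Listing tensorflow.compat.v1 first means TensorFlow 2 installations pick up the compatibility function, while older installations without that module fall through to the plain tensorflow package.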

What is "fe"?

I want to extract the CleanEval data. The README seems to explain how, like this:

import ch.ethz.dalab.web2text.utilities.Util
import ch.ethz.dalab.web2text.cleaneval.CleanEval
import ch.ethz.dalab.web2text.output.CsvDatasetWriter

val data = Util.time{ CleanEval.dataset(fe) }

// Write block_features.csv and edge_features.csv
// Format of a row: page id, groundtruth label (1/0), features ...
CsvDatasetWriter.write(data, "./src/main/python/data")

// Print the names of the exported features in order
println("# Block features")
fe.blockExtractor.labels.foreach(println)
println("# Edge features")
fe.edgeExtractor.labels.foreach(println)

However, I don't understand what "fe" is. Could you explain how to define "fe"?
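
For readers consuming the exported files from Python, the row format described in the comment above (page id, ground-truth label, then features) can be read roughly as follows. This is a sketch under assumptions: it presumes comma-separated rows with no header line and numeric features, and the sample rows are made up for illustration.

```python
import csv
import io


def parse_feature_rows(lines):
    """Yield (page_id, label, features) for each row of a feature CSV.

    Assumes comma-separated rows with no header line, laid out as
    page id, label (1/0), then numeric features.
    """
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        page_id, label = row[0], int(row[1])
        features = [float(value) for value in row[2:]]
        yield page_id, label, features


# Made-up sample rows, for illustration only.
sample = io.StringIO("12,1,0.5,3.0\n12,0,0.0,1.0\n")
for page_id, label, features in parse_feature_rows(sample):
    print(page_id, label, features)
```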

Release "data/cleaneval.npy" for reproducibility

Hello,

I am trying to benchmark some models against your train/test splits of the cleaneval data (thank you for releasing it as it is not available online anymore).

However, you did not release your "data/cleaneval.npy" file, which is necessary to use your src/main/python/data.py script.

The train/test splits are not replicable otherwise.

Thank you in advance.

how to improve accuracy?

I've tested on several websites so far, and it only grabs tiny excerpts of what it thinks is the main content. The extracted text is indeed part of the main content, but the rest of the main content is ignored.

I've used the recipe to generate the final output text. How can I tweak this so that it can grab the expected main content text?

By default, is it using pre-trained weights? How can I "teach" it so that its accuracy will improve?

So far I tested:

https://news.ycombinator = grabs only the first submission

https://openai.com/blog/openai-pytorch/ = " In the past, we implemented projects in many frameworks depending on their relative strengths. We’ve now chosen to standardize to make it easier for our team to create and share optimized implementations of our models." missing the first sentence and the rest of the text.

stuck on 2nd step of recipe: main.py

So far I've been able to extract features successfully; it produces the output_.....csv

Then I went into the src/main/python directory and installed numpy, tensorflow, and future with pip3.

Then, from the root directory, I ran python3 src/main/python/main.py classify output labelz:

asdf@ubuntu-s-1vcpu-1gb-sfo2-01:~/web2text$ python3 src/main/python/main.py classify output labelz
/home/asdf/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/asdf/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /home/asdf/web2text/src/main/python/forward.py:2: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

Traceback (most recent call last):
  File "src/main/python/main.py", line 313, in <module>
    main()
  File "src/main/python/main.py", line 40, in main
    classify(file_base + '_block_features.csv', file_base + '_edge_features.csv', labels_output_file)
  File "src/main/python/main.py", line 254, in classify
    unary_logits = unary(unary_features, is_training=False)
  File "/home/asdf/web2text/src/main/python/forward.py", line 19, in unary
    c = Config()
  File "/home/asdf/web2text/src/main/python/config.py", line 14, in __init__
    root[k] = FLAGS.__getattr__(k)
  File "/home/asdf/.local/lib/python3.6/site-packages/absl/flags/_flagvalues.py", line 491, in __getattr__
    raise _exceptions.UnparsedFlagAccessError(error_message)
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --logtostderr before flags were parsed.

Training the model with new dataset

Hi,

I am trying to train the model with a different set of data. I am following through your steps.

I have two questions:

  1. In the "Aligning cleaned text with original source" step, should I pass the HTML document as the source and the cleaned text file as the cleaned input?
  2. Is there a way to pick up all the HTML files inside the input folder and dump their block and edge features into one pair of CSV files?
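
For the second question, one possible approach is to loop over the folder and run the feature extractor once per file. The sketch below only builds the commands, reusing the ExtractPageFeatures main class from the sbt command quoted elsewhere on this page. The directory names "input" and "output" are hypothetical, and the sketch produces one pair of CSV files per page rather than a single merged file.

```python
import shlex
from pathlib import Path

# Main class name taken from the sbt command quoted elsewhere in these reports.
MAIN_CLASS = "ch.ethz.dalab.web2text.ExtractPageFeatures"


def build_extract_commands(input_dir, output_dir):
    """Return one sbt command (as an argument list) per .html file in input_dir."""
    commands = []
    for html_file in sorted(Path(input_dir).glob("*.html")):
        out_base = Path(output_dir) / html_file.stem
        run_main = "runMain %s %s %s" % (MAIN_CLASS, html_file, out_base)
        commands.append(["sbt", run_main])
    return commands


if __name__ == "__main__":
    # "input" and "output" are placeholder directory names.
    for cmd in build_extract_commands("input", "output"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each argument list can be passed to subprocess.run. File names containing spaces would need extra quoting inside the runMain string, which this sketch does not handle.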

KeyError: '__flags'

I did these steps:

  1. Set the CHECKPOINT_DIR variable in main.py.
  2. Make sure the files block_features.csv and edge_features.csv are in the src/main/python/data directory. Use the example from the previous section for this.
  3. Convert the CSV files to .npy with data/convert_scala_csv.py.
  4. Train the unary CNN with python3 main.py train_unary.

However, the fourth step failed with this error:

  File "main.py", line 262, in <module>
    main()
  File "main.py", line 28, in main
    train_unary()
  File "main.py", line 83, in train_unary
    dropout_keep_prob=DROPOUT_KEEP_PROB)
  File "/root/work/web2text/src/main/python/forward.py", line 19, in unary
    c = Config()
  File "/root/work/web2text/src/main/python/config.py", line 12, in __init__
    for k, v in iteritems(FLAGS.__dict__['__flags']):
KeyError: '__flags'

Could you tell me how to fix it?

My tensorflow version is 1.10.0
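
The traceback shows config.py reading FLAGS.__dict__['__flags'], a private entry that this TensorFlow version apparently does not provide. A more defensive way to collect the flag values, assuming the flags object exposes absl's flag_values_dict() method, might look like the sketch below. The helper name flags_as_dict is hypothetical and the approach has not been tested against web2text itself.

```python
def flags_as_dict(flags_obj):
    """Return flag names and values as a plain dict, across flag implementations."""
    # absl-based flag objects expose flag_values_dict().
    getter = getattr(flags_obj, "flag_values_dict", None)
    if callable(getter):
        return dict(getter())
    # Fall back to the private '__flags' entry that config.py reads today.
    return dict(flags_obj.__dict__["__flags"])


# Intended use in config.py (assumption, in place of the '__flags' lookup):
# for k, v in flags_as_dict(FLAGS).items():
#     root[k] = v
```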

unresolved dependency: ch.ethz.dalab#dissolvestruct_2.10;0.1-SNAPSHOT: not found

When I import this project using IDEA, the IDE shows the following error message:

[error] sbt.librarymanagement.ResolveException: unresolved dependency: ch.ethz.dalab#dissolvestruct_2.10;0.1-SNAPSHOT: not found
[error] at sbt.internal.librarymanagement.IvyActions$.resolveAndRetrieve(IvyActions.scala:331)
[error] at sbt.internal.librarymanagement.IvyActions$.$anonfun$updateEither$1(IvyActions.scala:205)
[error] at sbt.internal.librarymanagement.IvySbt$Module.$anonfun$withModule$1(Ivy.scala:229)
[error] at sbt.internal.librarymanagement.IvySbt.$anonfun$withIvy$1(Ivy.scala:190)
[error] at sbt.internal.librarymanagement.IvySbt.sbt$internal$librarymanagement$IvySbt$$action$1(Ivy.scala:70)
[error] at sbt.internal.librarymanagement.IvySbt$$anon$3.call(Ivy.scala:77)
[error] at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:93)
[error] at xsbt.boot.Locks$GlobalLock.xsbt$boot$Locks$GlobalLock$$withChannelRetries$1(Locks.scala:78)
[error] at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:97)
[error] at xsbt.boot.Using$.withResource(Using.scala:10)
[error] at xsbt.boot.Using$.apply(Using.scala:9)
[error] at xsbt.boot.Locks$GlobalLock.ignoringDeadlockAvoided(Locks.scala:58)
[error] at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:48)
[error] at xsbt.boot.Locks$.apply0(Locks.scala:31)
[error] at xsbt.boot.Locks$.apply(Locks.scala:28)
[error] at sbt.internal.librarymanagement.IvySbt.withDefaultLogger(Ivy.scala:77)
[error] at sbt.internal.librarymanagement.IvySbt.withIvy(Ivy.scala:185)
[error] at sbt.internal.librarymanagement.IvySbt.withIvy(Ivy.scala:182)
[error] at sbt.internal.librarymanagement.IvySbt$Module.withModule(Ivy.scala:228)
[error] at sbt.internal.librarymanagement.IvyActions$.updateEither(IvyActions.scala:190)
[error] at sbt.librarymanagement.ivy.IvyDependencyResolution.update(IvyDependencyResolution.scala:20)
[error] at sbt.librarymanagement.DependencyResolution.update(DependencyResolution.scala:56)
[error] at sbt.internal.LibraryManagement$.resolve$1(LibraryManagement.scala:38)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$12(LibraryManagement.scala:91)
[error] at sbt.util.Tracked$.$anonfun$lastOutput$1(Tracked.scala:68)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$19(LibraryManagement.scala:104)
[error] at scala.util.control.Exception$Catch.apply(Exception.scala:224)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11(LibraryManagement.scala:104)
[error] at sbt.internal.LibraryManagement$.$anonfun$cachedUpdate$11$adapted(LibraryManagement.scala:87)
[error] at sbt.util.Tracked$.$anonfun$inputChanged$1(Tracked.scala:149)
[error] at sbt.internal.LibraryManagement$.cachedUpdate(LibraryManagement.scala:118)
[error] at sbt.Classpaths$.$anonfun$updateTask$5(Defaults.scala:2353)
[error] at scala.Function1.$anonfun$compose$1(Function1.scala:44)
[error] at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:42)
[error] at sbt.std.Transform$$anon$4.work(System.scala:64)
[error] at sbt.Execute.$anonfun$submit$2(Execute.scala:257)
[error] at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:16)
[error] at sbt.Execute.work(Execute.scala:266)
[error] at sbt.Execute.$anonfun$submit$1(Execute.scala:257)
[error] at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:167)
[error] at sbt.CompletionService$$anon$2.call(CompletionService.scala:32)
[error] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[error] at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[error] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[error] at java.lang.Thread.run(Thread.java:745)
[error] (:update) sbt.librarymanagement.ResolveException: unresolved dependency: ch.ethz.dalab#dissolvestruct_2.10;0.1-SNAPSHOT: not found
[error] (:ssExtractDependencies) sbt.librarymanagement.ResolveException: unresolved dependency: ch.ethz.dalab#dissolvestruct_2.10;0.1-SNAPSHOT: not found
[error] Total time: 19 s, completed Apr 7, 2018 9:57:39 PM

How can I solve this error?

Error while Installing/Compiling the requirements

Hello,

I am encountering an error when installing your requirements. You stated that you tested with SBT 0.31. According to the same link you provide, that version of SBT is no longer available (it even seems it never existed).

Therefore I just downloaded the most recent version from here.
Then I ran the code as you suggest, with the following command:

sbt "runMain ch.ethz.dalab.web2text.ExtractPageFeatures index.html desired_output"

Sadly after some downloads, I encountered the following errors:

[error] web2text/cdom/Node.scala:39:33: missing parameter type
[error] val c = for (c <- children; l <- c.toString.lines) yield {"  " + l}
                            ^
[error] web2text/src/main/scala/ch/ethz/dalab/web2text/cleaneval/CleanEval.scala:140:54: value drop is not a member of java.util.stream.Stream[String]
[error] val contents = if (f.startsWith("URL:")) f.lines.drop(1).mkString("\n")
                                            ^
[error] web2text/src/main/scala/ch/ethz/dalab/web2text/features/PageFeatures.scala:29:63: type mismatch;
[error] found   : java.util.stream.Stream[String]
[error] required: Iterator[?]
[error] (blockFeatureLabels.toIterator zip blockFeatures.toString.lines)
[error] three errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 6 s, completed Dec 23, 2020, 5:07:31 PM

How else can I compile your code? Is there a Docker image you made back then that can still be run today?

Thanks!
