treygrainger / ai-powered-search
The codebase for the book "AI-Powered Search" (Manning Publications, 2024)
Home Page: https://aipoweredsearch.com
Enjoying the book. I got the following error when running docker-compose up for the first time:
failed to solve: rpc error: code = Unknown desc = executor failed running [/bin/bash -o pipefail -c python -m pip install --upgrade pip && pip install -r requirements.txt]: exit code: 137
How do I solve this, please? Thanks, Rob
docker build error.txt
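For anyone hitting the same thing: exit code 137 is 128 + 9, i.e. the process was killed with SIGKILL, which during a `pip install` step almost always means the out-of-memory killer fired. A hedged sketch of the diagnosis and a common workaround (the `--no-cache-dir` line is illustrative, not the repo's exact Dockerfile content):

```shell
# Exit code 137 = 128 + SIGKILL(9): the kernel's OOM killer terminated the
# pip install step, typically while handling large wheels such as torch.
echo "exit code meaning: $((128 + 9))"

# A common workaround (illustrative, not the repo's exact Dockerfile line) is
# to skip pip's cache so the build uses less memory and disk:
#   pip install --no-cache-dir -r requirements.txt
# Docker Desktop users can also raise the memory limit under
# Settings -> Resources before re-running docker-compose up.
```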
Hi,
I can't get the Solr data to persist :(
I added the following to ai-powered-search/docker/solr/docker-compose.yml:
volumes:
  aips-solr-data:
    name: aips-solr-data
    external: true
And I created the volume; the output of docker volume inspect aips-solr-data
is:
[
{
"CreatedAt": "2021-11-04T14:37:07+01:00",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/aips-solr-data/_data",
"Name": "aips-solr-data",
"Options": {},
"Scope": "local"
}
]
Any hints/help?
Thanks in advance!
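Declaring the volume at the top level is only half of it: the volume also has to be mounted into the solr service itself, or Solr keeps writing to the container filesystem and the data is lost on `docker-compose down`. A hedged sketch of the shape this could take (the `/var/solr` mount point is an assumption based on the official Solr images, not taken from this repo's compose file):

```yaml
services:
  solr:
    # ... existing image/ports config from the repo's docker-compose.yml ...
    volumes:
      - aips-solr-data:/var/solr   # mount point used by official Solr images

volumes:
  aips-solr-data:
    name: aips-solr-data
    external: true
```

Without the service-level `volumes:` entry, `docker volume inspect` will happily show the volume, but nothing ever writes into it.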
Because https://github.com/ai-powered-search/outdoors.git doesn't contain a single tar file (the archive is split into parts), I changed the following line in the notebook ch5/2.index-datasets
! cd outdoors && mkdir -p '../../data/outdoors/' && tar -xvf outdoors.tgz -C '../../data/outdoors/'
to
! cd outdoors && mkdir -p '../../data/outdoors/' && cat outdoors.tgz* | tar -xz && cp posts.csv ../../data/outdoors
The download and notebook seem to work now, but I'm not sure whether my solution is appropriate.
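For what it's worth, `cat outdoors.tgz* | tar -xz` is the standard way to handle an archive shipped as split parts: concatenating the parts reconstructs one valid gzip stream. A self-contained sketch of the same split-then-reassemble pattern (file names here are illustrative):

```shell
# Simulate a split archive like outdoors.tgz*: tar+gzip a file, split the
# archive into chunks, then reassemble the chunks and extract.
echo "sample row" > posts.csv
tar -czf outdoors.tgz posts.csv
split -b 1k outdoors.tgz outdoors.tgz.part-
rm outdoors.tgz posts.csv

mkdir -p extracted
cat outdoors.tgz.part-* | tar -xz -C extracted
cat extracted/posts.csv   # -> sample row
```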
I was able to run docker-compose up
and access the notebook (welcome.ipynb), but the solr container failed to start, with the error message:
Error! One or more containers are not responding.
Please follow the instructions in Appendix A.
$ docker-compose down && docker-compose up
Removing aips-data-science ...
Removing aips-solr ...
...
Creating aips-data-science ...
Creating aips-data-science ... done
Attaching to aips-zk, aips-solr, aips-data-science
aips-solr | /bin/sh: 0: Can't open <-------------- cannot open
aips-zk | ZooKeeper JMX enabled by default
...
aips-zk | 2021-05-09 18:08:19,839 [myid:1] - WARN [main:QuorumPeerMain@125] - Either no config or no quorum defined in config, running in standalone mode
aips-solr exited with code 127 <-------------- failed with code error 127
...
I am using Windows 10 with Docker Desktop v20.10.5 and WSL 2 integration (Ubuntu-18.04 distro).
I tried running docker-compose up in WSL, PowerShell, and Git Bash; all failed to start Solr.
Anyone else with this error message?
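For what it's worth, on Windows this combination of symptoms (/bin/sh "can't open" plus exit code 127) is often caused by Git converting the container's startup script to CRLF line endings on checkout. A minimal sketch reproducing the failure and the fix (the script name here is illustrative, since the real one is truncated in the output above):

```shell
# Reproduce: a script whose shebang line ends in \r points at the
# nonexistent interpreter "/bin/sh\r", so executing it fails (typically 127).
printf '#!/bin/sh\r\necho ok\r\n' > run_solr.sh
chmod +x run_solr.sh
sh -c ./run_solr.sh || echo "failed with exit code $?"

# Fix: strip the carriage returns (dos2unix does the same thing), or set
# "git config core.autocrlf input" before cloning so scripts stay LF-only.
tr -d '\r' < run_solr.sh > fixed.sh && chmod +x fixed.sh
sh -c ./fixed.sh   # -> ok
```

After fixing the endings, the image has to be rebuilt so the corrected script is copied into the container.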
The notebook defines a bunch of different functions but doesn't show how to use them.
Cell 8 returns <Response [405]>, and cell 9 can't find the file data/product_judgments.txt.
Has anyone seen this error / warning message before? Any ideas on how to fix it?
aips-solr | 2022-01-16 15:44:19.671 INFO (qtp1962329560-20) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/metrics params={wt=javabin&version=2&key=solr.jvm:os.processCpuLoad&key=solr.node:CONTAINER.fs.coreRoot.usableSpace&key=solr.jvm:os.systemLoadAverage&key=solr.jvm:memory.heap.used} status=0 QTime=26
Full docker-compose up logs attached.
docker jvm memory prob.txt
Or at least switch to Solr 8.11.x to match the supported-versions policy.
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
File ~/notebooks/ch10/../ltr/judgments.py:40, in judgments_open(path, mode)
39 try:
---> 40 f=open(path, mode)
41 if mode[0] == 'r':
FileNotFoundError: [Errno 2] No such file or directory: 'data/ai_pow_search_judgments.txt'
During handling of the above exception, another exception occurred:
UnboundLocalError Traceback (most recent call last)
Cell In[10], line 8
4 from ltr import download
6 ftr_logger=FeatureLogger(client, index='tmdb', feature_set='movies')
----> 8 with judgments_open('data/ai_pow_search_judgments.txt') as judgment_list:
9 for qid, query_judgments in groupby(judgment_list, key=lambda j: j.qid):
10 ftr_logger.log_for_qid(judgments=query_judgments,
11 qid=qid,
12 keywords=judgment_list.keywords(qid))
File /opt/conda/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
133 del self.args, self.kwds, self.func
134 try:
--> 135 return next(self.gen)
136 except StopIteration:
137 raise RuntimeError("generator didn't yield") from None
File ~/notebooks/ch10/../ltr/judgments.py:48, in judgments_open(path, mode)
46 writer.flush()
47 finally:
---> 48 f.close()
UnboundLocalError: local variable 'f' referenced before assignment
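The root cause above is the missing data/ai_pow_search_judgments.txt file, but the UnboundLocalError is a secondary bug in judgments.py: the finally block closes f even when open() itself raised, before f was ever assigned. A hedged sketch of a more defensive shape for the helper (the real function also wraps the file in a judgment reader/writer, which is omitted here):

```python
from contextlib import contextmanager

@contextmanager
def judgments_open(path, mode="rt"):
    # Open outside the try/finally: if open() raises (e.g. FileNotFoundError),
    # there is nothing to clean up yet, so the finally block can no longer
    # touch an unassigned local and mask the real error.
    f = open(path, mode)
    try:
        yield f
    finally:
        f.close()
```

With this shape, a missing file surfaces as a plain FileNotFoundError instead of the confusing UnboundLocalError.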
Right now we have two slightly different implementations of index_reviews_collection
in the source code (cells 10 & 11). The second one fails because of the file name, but it has more columns.
I get the following error when running the healthcheck. I tried posting this as a reply on the Manning forum but ran into the 5K-character limit, as well as a restriction on consecutive comments, so I'm putting my issue here.
Error! One or more containers are not responding.
Please follow the instructions in Appendix A.
I am running this on Windows 10 using Windows Terminal/PowerShell.
From the command line, I ran this to check the docker-compose version:
PS C:\kendevelopment\ai-powered-search\docker> docker-compose --version
docker-compose version 1.29.2, build 5becea4c
Then I ran docker-compose up; here's the output:
PS C:\kendevelopment\ai-powered-search\docker> docker-compose up
Creating network "docker_zk-solr" with the default driver
Creating network "docker_solr-data-science" with the default driver
Creating aips-zk ... done
Creating aips-solr ... done
Creating aips-data-science ... done
Attaching to aips-zk, aips-solr, aips-data-science
: No such file /bin/sh: 0: cannot open
aips-zk | ZooKeeper JMX enabled by default
aips-zk | Using config: /conf/zoo.cfg
aips-zk | 2022-04-01 14:00:43,938 [myid:] - INFO [main:QuorumPeerConfig@133] - Reading configuration from: /conf/zoo.cfg
aips-zk | 2022-04-01 14:00:43,945 [myid:] - INFO [main:QuorumPeerConfig@375] - clientPort is not set
aips-zk | 2022-04-01 14:00:43,945 [myid:] - INFO [main:QuorumPeerConfig@389] - secureClientPort is not set
aips-zk | 2022-04-01 14:00:43,954 [myid:] - ERROR [main:QuorumPeerConfig@645] - Invalid configuration, only one server specified (ignoring)
aips-zk | 2022-04-01 14:00:43,960 [myid:1] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
aips-zk | 2022-04-01 14:00:43,961 [myid:1] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
aips-zk | 2022-04-01 14:00:43,961 [myid:1] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
aips-zk | 2022-04-01 14:00:43,961 [myid:1] - WARN [main:QuorumPeerMain@125] - Either no config or no quorum defined in config, running in standalone mode
aips-zk | 2022-04-01 14:00:43,964 [myid:1] - INFO [main:ManagedUtil@46] - Log4j found with jmx enabled.
aips-zk | 2022-04-01 14:00:43,976 [myid:1] - INFO [main:QuorumPeerConfig@133] - Reading configuration from: /conf/zoo.cfg
aips-zk | 2022-04-01 14:00:43,976 [myid:1] - INFO [main:QuorumPeerConfig@375] - clientPort is not set
aips-zk | 2022-04-01 14:00:43,977 [myid:1] - INFO [main:QuorumPeerConfig@389] - secureClientPort is not set
aips-zk | 2022-04-01 14:00:43,977 [myid:1] - ERROR [main:QuorumPeerConfig@645] - Invalid configuration, only one server specified (ignoring)
aips-zk | 2022-04-01 14:00:43,977 [myid:1] - INFO [main:ZooKeeperServerMain@117] - Starting server
aips-zk | 2022-04-01 14:00:44,040 [myid:1] - INFO [main:Environment@109] - Server environment:zookeeper.version=3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
aips-zk | 2022-04-01 14:00:44,040 [myid:1] - INFO [main:Environment@109] - Server environment:host.name=aips-zk
aips-zk | 2022-04-01 14:00:44,042 [myid:1] - INFO [main:Environment@109] - Server environment:java.version=1.8.0_232
aips-zk | 2022-04-01 14:00:44,042 [myid:1] - INFO [main:Environment@109] - Server environment:java.vendor=Oracle Corporation
aips-zk | 2022-04-01 14:00:44,042 [myid:1] - INFO [main:Environment@109] - Server environment:java.home=/usr/local/openjdk-8
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:java.class.path=/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/classes:/apache-zookeeper-3.5.5-bin/bin/../build/classes:/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/lib/.jar:/apache-zookeeper-3.5.5-bin/bin/../build/lib/.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-jute-3.5.5.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-3.5.5.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-log4j12-1.7.25.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-api-1.7.25.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/netty-all-4.1.29.Final.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/log4j-1.2.17.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/json-simple-1.1.1.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jline-2.11.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-util-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-servlet-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-server-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-security-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-io-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-http-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/javax.servlet-api-3.1.0.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-databind-2.9.8.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-core-2.9.8.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-annotations-2.9.0.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/commons-cli-1.2.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/audience-annotations-0.5.0.jar:/apache-zookeeper-3.5.5-bin/bin/../zookeeper-.jar:/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/src/main/resources/lib/.jar:/conf:
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:java.io.tmpdir=/tmp
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:java.compiler=
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:os.name=Linux
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:os.arch=amd64
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:os.version=5.10.102.1-microsoft-standard-WSL2
aips-zk | 2022-04-01 14:00:44,043 [myid:1] - INFO [main:Environment@109] - Server environment:user.name=zookeeper
aips-zk | 2022-04-01 14:00:44,044 [myid:1] - INFO [main:Environment@109] - Server environment:user.home=/home/zookeeper
aips-zk | 2022-04-01 14:00:44,044 [myid:1] - INFO [main:Environment@109] - Server environment:user.dir=/apache-zookeeper-3.5.5-bin
aips-zk | 2022-04-01 14:00:44,044 [myid:1] - INFO [main:Environment@109] - Server environment:os.memory.free=367MB
aips-zk | 2022-04-01 14:00:44,044 [myid:1] - INFO [main:Environment@109] - Server environment:os.memory.max=889MB
aips-zk | 2022-04-01 14:00:44,044 [myid:1] - INFO [main:Environment@109] - Server environment:os.memory.total=379MB
aips-zk | 2022-04-01 14:00:44,048 [myid:1] - INFO [main:ZooKeeperServer@938] - minSessionTimeout set to 4000
aips-zk | 2022-04-01 14:00:44,048 [myid:1] - INFO [main:ZooKeeperServer@947] - maxSessionTimeout set to 40000
aips-zk | 2022-04-01 14:00:44,049 [myid:1] - INFO [main:ZooKeeperServer@166] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /datalog/version-2 snapdir /data/version-2
aips-zk | 2022-04-01 14:00:44,084 [myid:1] - INFO [main:Log@193] - Logging initialized @772ms to org.eclipse.jetty.util.log.Slf4jLog
aips-zk | 2022-04-01 14:00:44,184 [myid:1] - WARN [main:ContextHandler@1588] - o.e.j.s.ServletContextHandler@cb644e{/,null,UNAVAILABLE} contextPath ends with /*
aips-zk | 2022-04-01 14:00:44,184 [myid:1] - WARN [main:ContextHandler@1599] - Empty contextPath
aips-zk | 2022-04-01 14:00:44,198 [myid:1] - INFO [main:Server@370] - jetty-9.4.17.v20190418; built: 2019-04-18T19:45:35.259Z; git: aa1c656c315c011c01e7b21aabb04066635b9f67; jvm 1.8.0_232-b09
aips-zk | 2022-04-01 14:00:44,259 [myid:1] - INFO [main:DefaultSessionIdManager@365] - DefaultSessionIdManager workerName=node0
aips-zk | 2022-04-01 14:00:44,259 [myid:1] - INFO [main:DefaultSessionIdManager@370] - No SessionScavenger set, using defaults
aips-zk | 2022-04-01 14:00:44,262 [myid:1] - INFO [main:HouseKeeper@149] - node0 Scavenging every 600000ms
aips-zk | 2022-04-01 14:00:44,297 [myid:1] - INFO [main:ContextHandler@855] - Started o.e.j.s.ServletContextHandler@cb644e{/,null,AVAILABLE}
aips-zk | 2022-04-01 14:00:44,327 [myid:1] - INFO [main:AbstractConnector@292] - Started ServerConnector@100fc185{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
aips-zk | 2022-04-01 14:00:44,327 [myid:1] - INFO [main:Server@410] - Started @1031ms
aips-zk | 2022-04-01 14:00:44,328 [myid:1] - INFO [main:JettyAdminServer@112] - Started AdminServer on address 0.0.0.0, port 8080 and command URL /commands
aips-zk | 2022-04-01 14:00:44,333 [myid:1] - INFO [main:ServerCnxnFactory@135] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
aips-zk | 2022-04-01 14:00:44,338 [myid:1] - INFO [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 kB direct buffers.
aips-zk | 2022-04-01 14:00:44,339 [myid:1] - INFO [main:NIOServerCnxnFactory@686] - binding to port /0.0.0.0:2181
aips-zk | 2022-04-01 14:00:44,356 [myid:1] - INFO [main:ZKDatabase@117] - zookeeper.snapshotSizeFactor = 0.33
aips-zk | 2022-04-01 14:00:44,359 [myid:1] - INFO [main:FileTxnSnapLog@372] - Snapshotting: 0x0 to /data/version-2/snapshot.0
aips-zk | 2022-04-01 14:00:44,362 [myid:1] - INFO [main:FileTxnSnapLog@372] - Snapshotting: 0x0 to /data/version-2/snapshot.0
aips-zk | 2022-04-01 14:00:44,376 [myid:1] - INFO [main:ContainerManager@64] - Using checkIntervalMs=60000 maxPerMinute=10000
aips-solr exited with code 2
aips-data-science | [I 14:00:46.559 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
aips-data-science | [W 14:00:47.634 NotebookApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
aips-data-science | [W 14:00:47.659 NotebookApp] Error loading server extension jupyterlab
aips-data-science | Traceback (most recent call last):
aips-data-science | File "/home/jovyan/.local/lib/python3.7/site-packages/notebook/notebookapp.py", line 1572, in init_server_extensions
aips-data-science | mod = importlib.import_module(modulename)
aips-data-science | File "/opt/conda/lib/python3.7/importlib/__init__.py", line 127, in import_module
aips-data-science | return _bootstrap._gcd_import(name[level:], package, level)
aips-data-science | File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
aips-data-science | File "<frozen importlib._bootstrap>", line 983, in _find_and_load
aips-data-science | File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
aips-data-science | File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
aips-data-science | File "<frozen importlib._bootstrap_external>", line 728, in exec_module
aips-data-science | File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
aips-data-science | File "/opt/conda/lib/python3.7/site-packages/jupyterlab/__init__.py", line 7, in <module>
aips-data-science | from .labapp import LabApp
aips-data-science | File "/opt/conda/lib/python3.7/site-packages/jupyterlab/labapp.py", line 15, in <module>
aips-data-science | from jupyter_server.serverapp import flags
aips-data-science | File "/opt/conda/lib/python3.7/site-packages/jupyter_server/serverapp.py", line 40, in <module>
aips-data-science | from jupyter_core.paths import secure_write
aips-data-science | ImportError: cannot import name 'secure_write' from 'jupyter_core.paths' (/home/jovyan/.local/lib/python3.7/site-packages/jupyter_core/paths.py)
aips-data-science | [I 14:00:47.664 NotebookApp] Serving notebooks from local directory: /home/jovyan/notebooks
aips-data-science | [I 14:00:47.664 NotebookApp] The Jupyter Notebook is running at:
aips-data-science | [I 14:00:47.664 NotebookApp] http://(4214d2bd0462 or 127.0.0.1):8888/
aips-data-science | [I 14:00:47.664 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Everything looks OK except for the "aips-solr exited with code 2" message; I would expect Solr to be running.
In a separate Windows Terminal window I ran docker-compose ps:
aips-data-science tini -g -- /bin/bash -o pi ... Up 0.0.0.0:2345->2345/tcp, 0.0.0.0:7077->7077/tcp, 0.0.0.0:8082->8080/tcp, 0.0.0.0:8081->8081/tcp, 0.0.0.0:8888->8888/tcp
aips-solr /bin/sh -c "./run_solr_w_l ... Exit 2
aips-zk /docker-entrypoint.sh zkSe ... Up 0.0.0.0:2181->2128/tcp, 2181/tcp, 2888/tcp, 3888/tcp, 8080/tcp
PS C:\kendevelopment\ai-powered-search\docker>
aips-solr with a state of Exit 2 is probably not good. I then went to http://localhost:8888/notebooks/welcome.ipynb
and ran the healthcheck:
Error! One or more containers are not responding.
Please follow the instructions in Appendix A.
Thanks for your help, and thanks for writing this book; I'm really motivated to learn this material.
Howdy, any idea how I can get around this final issue with the install step? I ran this immediately after cloning the repo. Thanks!
jeff@Jeffreys-iMac docker % docker-compose up
Building notebooks
Step 1/30 : FROM jupyter/scipy-notebook:2021-11-04
---> 8255b7a7b41e
Step 2/30 : USER root
---> Using cache
---> 603875557610
Step 3/30 : RUN sudo apt-get update && apt-get install -y --reinstall build-essential
---> Using cache
---> 4aca572c5f97
Step 4/30 : ENV APACHE_SPARK_VERSION=2.4.7 HADOOP_VERSION=2.7 SPARK_SOLR_VERSION=3.8.0
---> Using cache
---> f53298cb7313
Step 5/30 : RUN apt-get -y update && apt-get install --no-install-recommends -y openjdk-8-jre-headless ca-certificates-java && rm -rf /var/lib/apt/lists/*
---> Using cache
---> 9946e753dd7d
Step 6/30 : RUN conda install python=3.7.12
---> Using cache
---> 029bacab755d
Step 7/30 : COPY pull_aips_dependency.py pull_aips_dependency.py
---> Using cache
---> 8c11a388abe1
Step 8/30 : RUN python pull_aips_dependency.py spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz && tar xzf spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz -C /usr/local --owner root --group root --no-same-owner && rm spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
---> Using cache
---> a4d55e5f7492
Step 9/30 : RUN cd /usr/local && ln -s spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} spark
---> Using cache
---> 0de66e85c2f1
Step 10/30 : ENV SPARK_HOME=/usr/local/spark
---> Using cache
---> f26c8187d9ad
Step 11/30 : ENV PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip SPARK_OPTS="--driver-java-options=-Xms1024M --driver-java-options=-Xmx4096M --spark.driver.extraLibraryPath=/usr/local/spark/lib/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar --spark.executor.extraLibraryPath=/usr/local/spark/lib/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar --driver-java-options=-Dlog4j.logLevel=info" PATH=$PATH:$SPARK_HOME/bin
---> Using cache
---> a87bf90e4302
Step 12/30 : ENV SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/local/spark/lib/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar
---> Using cache
---> 80247e5f54ac
Step 13/30 : ENV PYSPARK_SUBMIT_ARGS="--jars /usr/local/spark/lib/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar"
---> Using cache
---> d0dd4e045c79
Step 14/30 : Run echo $SPARK_HOME
---> Using cache
---> 2bbc1fc08916
Step 15/30 : Run mkdir /usr/local/spark/lib/ && cd /usr/local/spark/lib/ && wget -q https://repo1.maven.org/maven2/com/lucidworks/spark/spark-solr/${SPARK_SOLR_VERSION}/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar && echo "3bd0614d50ce6ef2769eb0d654e58fd68cf3e1f63c567dca8b12432a7e6ac907753b289f6d3cca5a80a67454d6ff841e438f53472cba37530293548751edaa8f *spark-solr-${SPARK_SOLR_VERSION}-shaded.jar" | sha512sum -c - && export EXTRA_CLASSPATH=/usr/local/spark/lib/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar && $SPARK_HOME/bin/spark-shell --jars spark-solr-${SPARK_SOLR_VERSION}-shaded.jar
---> Using cache
---> ac5858d662ed
Step 16/30 : Run chmod a+rwx /usr/local/spark/lib/spark-solr-${SPARK_SOLR_VERSION}-shaded.jar
---> Using cache
---> bf4d54255f2a
Step 17/30 : COPY notebooks notebooks
---> Using cache
---> 13738a4630e6
Step 18/30 : RUN chown -R $NB_UID:$NB_UID /home/$NB_USER
---> Using cache
---> a00e63fb40b0
Step 19/30 : USER $NB_UID
---> Using cache
---> f15a2cc6aabd
Step 20/30 : WORKDIR /home/$NB_USER
---> Using cache
---> 3f0ecb34d427
Step 21/30 : COPY requirements.txt ./
---> 00d45e820595
Step 22/30 : RUN python -m pip install --upgrade pip && pip install -r requirements.txt
---> Running in 0d99e2cee0bb
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pip in /opt/conda/lib/python3.7/site-packages (21.3.1)
Defaulting to user installation because normal site-packages is not writeable
Collecting appnope==0.1.0
Downloading appnope-0.1.0-py2.py3-none-any.whl (4.0 kB)
Collecting attrs==19.1.0
Downloading attrs-19.1.0-py2.py3-none-any.whl (35 kB)
Collecting backcall==0.1.0
Downloading backcall-0.1.0.zip (11 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting bleach==3.1.4
Downloading bleach-3.1.4-py2.py3-none-any.whl (151 kB)
Collecting bs4==0.0.1
Downloading bs4-0.0.1.tar.gz (1.1 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting certifi==2019.6.16
Downloading certifi-2019.6.16-py2.py3-none-any.whl (157 kB)
Collecting chardet==3.0.4
Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
Collecting cython==0.29.20
Downloading Cython-0.29.20-cp37-cp37m-manylinux1_x86_64.whl (2.0 MB)
Collecting decorator==4.4.0
Downloading decorator-4.4.0-py2.py3-none-any.whl (8.3 kB)
Collecting defusedxml==0.6.0
Downloading defusedxml-0.6.0-py2.py3-none-any.whl (23 kB)
Requirement already satisfied: entrypoints==0.3 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 11)) (0.3)
Collecting findspark==1.3.0
Downloading findspark-1.3.0-py2.py3-none-any.whl (3.0 kB)
Collecting idna==2.8
Downloading idna-2.8-py2.py3-none-any.whl (58 kB)
Collecting ipykernel==5.1.1
Downloading ipykernel-5.1.1-py3-none-any.whl (114 kB)
Collecting ipython==7.5.0
Downloading ipython-7.5.0-py3-none-any.whl (770 kB)
Requirement already satisfied: ipython-genutils==0.2.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 16)) (0.2.0)
Collecting ipywidgets==7.5.0
Downloading ipywidgets-7.5.0-py2.py3-none-any.whl (121 kB)
Collecting jedi==0.14.0
Downloading jedi-0.14.0-py2.py3-none-any.whl (1.0 MB)
Collecting Jinja2==2.10.1
Downloading Jinja2-2.10.1-py2.py3-none-any.whl (124 kB)
Collecting jsonschema==3.0.1
Downloading jsonschema-3.0.1-py2.py3-none-any.whl (54 kB)
Collecting jupyter==1.0.0
Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting jupyter-client==5.2.4
Downloading jupyter_client-5.2.4-py2.py3-none-any.whl (89 kB)
Collecting jupyter-console==6.0.0
Downloading jupyter_console-6.0.0-py2.py3-none-any.whl (21 kB)
Collecting jupyter-core==4.5.0
Downloading jupyter_core-4.5.0-py2.py3-none-any.whl (78 kB)
Collecting jupyterlab_server==1.0.0
Downloading jupyterlab_server-1.0.0-py3-none-any.whl (26 kB)
Collecting lxml==4.6.2
Downloading lxml-4.6.2-cp37-cp37m-manylinux1_x86_64.whl (5.5 MB)
Collecting MarkupSafe==1.1.1
Downloading MarkupSafe-1.1.1-cp37-cp37m-manylinux2010_x86_64.whl (33 kB)
Collecting mergedeep==1.3.0
Downloading mergedeep-1.3.0-py3-none-any.whl (6.3 kB)
Requirement already satisfied: mistune==0.8.4 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 29)) (0.8.4)
Collecting nbconvert==5.5.0
Downloading nbconvert-5.5.0-py2.py3-none-any.whl (447 kB)
Collecting nbformat==4.4.0
Downloading nbformat-4.4.0-py2.py3-none-any.whl (155 kB)
Requirement already satisfied: notebook in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 33)) (6.4.5)
Collecting nltk==3.5
Downloading nltk-3.5.zip (1.4 MB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting nmslib==2.1.1
Downloading nmslib-2.1.1-cp37-cp37m-manylinux2010_x86_64.whl (13.5 MB)
Collecting numpy==1.19.0
Downloading numpy-1.19.0-cp37-cp37m-manylinux2010_x86_64.whl (14.6 MB)
Collecting pandocfilters==1.4.2
Downloading pandocfilters-1.4.2.tar.gz (14 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting parso==0.5.0
Downloading parso-0.5.0-py2.py3-none-any.whl (94 kB)
Collecting pexpect==4.7.0
Downloading pexpect-4.7.0-py2.py3-none-any.whl (58 kB)
Requirement already satisfied: pickleshare==0.7.5 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 41)) (0.7.5)
Collecting plotly==4.14.3
Downloading plotly-4.14.3-py2.py3-none-any.whl (13.2 MB)
Collecting plotnine==0.7.1
Downloading plotnine-0.7.1-py3-none-any.whl (4.4 MB)
Collecting prometheus-client==0.7.1
Downloading prometheus_client-0.7.1.tar.gz (38 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting prompt-toolkit==2.0.9
Downloading prompt_toolkit-2.0.9-py3-none-any.whl (337 kB)
Collecting ptyprocess==0.6.0
Downloading ptyprocess-0.6.0-py2.py3-none-any.whl (39 kB)
Collecting py4j==0.10.7
Downloading py4j-0.10.7-py2.py3-none-any.whl (197 kB)
Collecting Pygments==2.4.2
Downloading Pygments-2.4.2-py2.py3-none-any.whl (883 kB)
Collecting pyrsistent==0.15.2
Downloading pyrsistent-0.15.2.tar.gz (106 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting pysolr==3.8.1
Downloading pysolr-3.8.1-py2.py3-none-any.whl (16 kB)
Collecting python-dateutil==2.8.0
Downloading python_dateutil-2.8.0-py2.py3-none-any.whl (226 kB)
Collecting pytz==2019.1
Downloading pytz-2019.1-py2.py3-none-any.whl (510 kB)
Collecting pyzmq==18.0.1
Downloading pyzmq-18.0.1-cp37-cp37m-manylinux1_x86_64.whl (1.1 MB)
Collecting qtconsole==4.5.1
Downloading qtconsole-4.5.1-py2.py3-none-any.whl (118 kB)
Collecting requests==2.22.0
Downloading requests-2.22.0-py2.py3-none-any.whl (57 kB)
Collecting retrying==1.3.3
Downloading retrying-1.3.3.tar.gz (10 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting Send2Trash==1.5.0
Downloading Send2Trash-1.5.0-py3-none-any.whl (12 kB)
Collecting six==1.12.0
Downloading six-1.12.0-py2.py3-none-any.whl (10 kB)
Collecting spacy==2.3.0
Downloading spacy-2.3.0-cp37-cp37m-manylinux1_x86_64.whl (10.0 MB)
Collecting transformers==4.5.1
Downloading transformers-4.5.1-py3-none-any.whl (2.1 MB)
Collecting sentence-transformers==1.1.0
Downloading sentence-transformers-1.1.0.tar.gz (78 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting testpath==0.4.2
Downloading testpath-0.4.2-py2.py3-none-any.whl (163 kB)
Collecting tornado==6.0.3
Downloading tornado-6.0.3.tar.gz (482 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting traitlets==4.3.2
Downloading traitlets-4.3.2-py2.py3-none-any.whl (74 kB)
Collecting urllib3==1.25.4
Downloading urllib3-1.25.4-py2.py3-none-any.whl (125 kB)
Collecting wcwidth==0.1.7
Downloading wcwidth-0.1.7-py2.py3-none-any.whl (21 kB)
Requirement already satisfied: webencodings==0.5.1 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 69)) (0.5.1)
Collecting widgetsnbextension==3.5.0
Downloading widgetsnbextension-3.5.0-py2.py3-none-any.whl (2.2 MB)
Requirement already satisfied: beautifulsoup4 in /opt/conda/lib/python3.7/site-packages (from bs4==0.0.1->-r requirements.txt (line 5)) (4.10.0)
Requirement already satisfied: setuptools>=18.5 in /opt/conda/lib/python3.7/site-packages (from ipython==7.5.0->-r requirements.txt (line 15)) (60.0.4)
Requirement already satisfied: json5 in /opt/conda/lib/python3.7/site-packages (from jupyterlab_server==1.0.0->-r requirements.txt (line 25)) (0.9.5)
Requirement already satisfied: click in /opt/conda/lib/python3.7/site-packages (from nltk==3.5->-r requirements.txt (line 35)) (8.0.3)
Requirement already satisfied: joblib in /opt/conda/lib/python3.7/site-packages (from nltk==3.5->-r requirements.txt (line 35)) (1.1.0)
Collecting regex
Downloading regex-2021.11.10-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (749 kB)
Requirement already satisfied: tqdm in /opt/conda/lib/python3.7/site-packages (from nltk==3.5->-r requirements.txt (line 35)) (4.62.3)
Requirement already satisfied: psutil in /opt/conda/lib/python3.7/site-packages (from nmslib==2.1.1->-r requirements.txt (line 36)) (5.8.0)
Collecting pybind11<2.6.2
Downloading pybind11-2.6.1-py2.py3-none-any.whl (188 kB)
Requirement already satisfied: scipy>=1.2.0 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 43)) (1.7.3)
Requirement already satisfied: patsy>=0.5.1 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 43)) (0.5.2)
Requirement already satisfied: statsmodels>=0.11.1 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 43)) (0.13.1)
Collecting mizani>=0.7.1
Downloading mizani-0.7.3-py3-none-any.whl (63 kB)
Requirement already satisfied: matplotlib>=3.1.1 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 43)) (3.5.1)
Requirement already satisfied: pandas>=1.1.0 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 43)) (1.3.5)
Collecting descartes>=1.1.0
Downloading descartes-1.1.0-py3-none-any.whl (5.8 kB)
Collecting preshed<3.1.0,>=3.0.2
Downloading preshed-3.0.6-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (125 kB)
Collecting thinc==7.4.1
Downloading thinc-7.4.1-cp37-cp37m-manylinux1_x86_64.whl (2.1 MB)
Collecting blis<0.5.0,>=0.4.0
Downloading blis-0.4.1-cp37-cp37m-manylinux1_x86_64.whl (3.7 MB)
Collecting catalogue<1.1.0,>=0.0.7
Downloading catalogue-1.0.0-py2.py3-none-any.whl (7.7 kB)
Collecting srsly<1.1.0,>=1.0.2
Downloading srsly-1.0.5-cp37-cp37m-manylinux2014_x86_64.whl (184 kB)
Collecting cymem<2.1.0,>=2.0.2
Downloading cymem-2.0.6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (35 kB)
Collecting wasabi<1.1.0,>=0.4.0
Downloading wasabi-0.9.0-py3-none-any.whl (25 kB)
Collecting murmurhash<1.1.0,>=0.28.0
Downloading murmurhash-1.0.6-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21 kB)
Collecting plac<1.2.0,>=0.9.6
Downloading plac-1.1.3-py2.py3-none-any.whl (20 kB)
Collecting tokenizers<0.11,>=0.10.1
Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Requirement already satisfied: importlib-metadata in /opt/conda/lib/python3.7/site-packages (from transformers==4.5.1->-r requirements.txt (line 60)) (4.10.0)
Collecting sacremoses
Downloading sacremoses-0.0.46-py3-none-any.whl (895 kB)
Requirement already satisfied: packaging in /opt/conda/lib/python3.7/site-packages (from transformers==4.5.1->-r requirements.txt (line 60)) (21.2)
Collecting filelock
Downloading filelock-3.4.0-py3-none-any.whl (9.8 kB)
Collecting torch>=1.6.0
Downloading torch-1.10.1-cp37-cp37m-manylinux1_x86_64.whl (881.9 MB)
ERROR: Service 'notebooks' failed to build : The command '/bin/bash -o pipefail -c python -m pip install --upgrade pip && pip install -r requirements.txt' returned a non-zero code: 137
Try whether the Spark 3.4.0 image works.
It's a known issue caused by the newer dependency versions, related to holzschu/Carnets#310; after pinning the dependencies it works.
Hello,
First of all, this is an awesome book. Currently, I'm working with it day and night. :D
Right now I'm in Chapter 5, but since the last commit, 2.index-datasets.ipynb has been broken. At commit 7cbf560aa9b4fb12a6b453b819e716774984e397
it works, but not after that.
I get the following error if I try to open it:
NotJSONError("Notebook does not appear to be JSON: '\\n\\n\\n\\n\\n\\n\\n<!DOCTYPE html>\\n<html la...")
We should see if we can clean this up before publication:
aips-data-science | [I 2023-12-03 16:44:07.224 ServerApp] Generating new user for token-authenticated request: d61e0fc3fb7e48b383d772bfa33204da
aips-data-science | [I 2023-12-03 16:44:12.296 ServerApp] Generating new user for token-authenticated request: 36893885b19344c28acefaffb4aa2471
aips-data-science | [I 2023-12-03 16:44:17.363 ServerApp] Generating new user for token-authenticated request: 4ff7daa059a24581abd581bef7814b32
aips-data-science | [I 2023-12-03 16:44:22.419 ServerApp] Generating new user for token-authenticated request: df5178e8735246f4bd8a0331248208e0
aips-data-science | [I 2023-12-03 16:44:27.476 ServerApp] Generating new user for token-authenticated request: bad3506047074bf898b8e747a16c19d5
aips-data-science | [I 2023-12-03 16:44:32.527 ServerApp] Generating new user for token-authenticated request: d045277bf54d426cb4c2d4aaba3f7304
aips-data-science | [I 2023-12-03 16:44:37.594 ServerApp] Generating new user for token-authenticated request: b04cc852e49246dd8ac9f7e022493374
aips-data-science | [I 2023-12-03 16:44:42.682 ServerApp] Generating new user for token-authenticated request: 807f25cf442e4754994fa9a6e0d0f9d2
aips-data-science | [I 2023-12-03 16:44:47.755 ServerApp] Generating new user for token-authenticated request: 88a9aee0871642169b06eea8aa001b91
aips-data-science | [I 2023-12-03 16:44:52.818 ServerApp] Generating new user for token-authenticated request: fa90fdf67f314784846439465a05ac6b
Right now, only chapters 1-5 and 10 are listed in the welcome notebook. We need to list each chapter and provide links to the specific notebooks.
stacktrace:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[29], line 30
28 train, test = test_train_split(sdbn, train=0.8)
29 ranksvm_ltr(train, model_name='exploit', feature_set=exploit_feature_set)
---> 30 eval_model(test, model_name='exploit', sdbn=new_sdbn)
32 # ===============
33 # EXPLORE
35 explore_feature_set = [
36 {
37 "name" : "manufacturer_match",
(...)
66 }
67 }]
NameError: name 'new_sdbn' is not defined
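The NameError suggests the cell should pass the `sdbn` that fed `test_train_split`, since `new_sdbn` is never defined in the notebook. A minimal runnable sketch of that assumption, with a stand-in `eval_model` used only to illustrate the call shape (the real function comes from the chapter's LTR helpers):

```python
# Stand-in for the notebook's eval_model; it just echoes its inputs so the
# corrected call shape can be demonstrated without the chapter's helpers.
def eval_model(test, model_name, sdbn):
    return (model_name, len(test), len(sdbn))

# Toy SDBN click-model judgments (hypothetical data).
sdbn = {"q1": [0.4, 0.1], "q2": [0.7, 0.2]}
train = {"q1": sdbn["q1"]}
test = {"q2": sdbn["q2"]}

# The notebook cell passes the undefined `new_sdbn`; passing the `sdbn`
# used for the split avoids the NameError (an assumption about intent).
result = eval_model(test, model_name="exploit", sdbn=sdbn)
```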
The cell 27 fails with following stacktrace:
KeyError Traceback (most recent call last)
Cell In[27], line 5
2 purchases = {'test1': 0, 'test3': 0}
3 for _ in range(0, NUM_USERS):
----> 5 model_name, purchase_made = a_or_b_model(query='transformers dvd',
6 a_model='test1',
7 b_model='test3')
8 if purchase_made:
9 purchases[model_name]+= 1
Cell In[17], line 20, in a_or_b_model(query, a_model, b_model)
17 else:
18 model_name=b_model
---> 20 purchase_made = live_user_query(query=query,
21 model_name=model_name,
22 desired=wants_to_purchase,
23 meh=might_purchase)
24 return (model_name, purchase_made)
Cell In[16], line 15, in live_user_query(query, model_name, desired, meh, desired_prob, meh_prob, uninteresting_prob, quit_per_rank_prob)
1 def live_user_query(query, model_name,
2 desired, meh,
3 desired_prob=0.15,
4 meh_prob=0.03,
5 uninteresting_prob=0.01,
6 quit_per_rank_prob=0.2):
7 """Live user for 'query' where purchase probability depends on if
8 products upc is in one of three sets.
9
(...)
13
14 """
---> 15 search_results = search(query, model_name, at=10)
17 results = pd.DataFrame(search_results).reset_index()
18 for doc in results.to_dict(orient="records"):
Cell In[11], line 32, in search(query, model_name, at, log)
29 if log:
30 print(resp)
---> 32 search_results = resp['response']['docs']
34 for rank, result in enumerate(search_results):
35 result['rank'] = rank
KeyError: 'response'
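The KeyError means Solr replied with an error body instead of a result set, and the raw cause is hidden unless `log=True`. A small guard surfaces it directly; `extract_docs` is a hypothetical helper, and the dict shapes follow Solr's standard JSON response format:

```python
def extract_docs(resp):
    """Return the docs list from a Solr JSON response, or raise a clear
    error when Solr replied with an error body instead of results."""
    if "response" not in resp:
        raise RuntimeError(f"Solr error: {resp.get('error', resp)}")
    return resp["response"]["docs"]

# A successful response contains a 'response' key with the matching docs:
ok = {"response": {"numFound": 1, "docs": [{"id": "1"}]}}
docs = extract_docs(ok)  # → [{"id": "1"}]
```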
Howdy, enjoying the book! Can you add the Retrotech search sessions dataset? This would be to play around with the data referenced in chapter 11 regarding SDBNs.
or perhaps to the spark-3-support branch (to support the working pip installable branch, per #46)
Thank you!
Delete dynamic field
Status: Failure; Response:[ {'responseHeader': {'status': 400, 'QTime': 89}, 'error': {'metadata': ['error-class', 'org.apache.solr.api.ApiBag$ExceptionWithErrObject', 'root-error-class', 'org.apache.solr.api.ApiBag$ExceptionWithErrObject'], 'details': [{'delete-dynamic-field': {'name': '*_ngram'}, 'errorMessages': ["The dynamic field '*_ngram' is not present in this schema, and so cannot be deleted.\n"]}], 'msg': 'error processing commands', 'code': 400}} ]
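Solr returns a 400 when asked to delete a dynamic field that isn't in the schema, so checking the schema first makes the cleanup step idempotent across notebook re-runs. A sketch with a hypothetical helper that operates on the dynamic-field names fetched from the Schema API:

```python
def safe_delete_command(existing_dynamic_fields, name):
    """Build a delete-dynamic-field schema command only when the field
    actually exists, so re-running the notebook avoids the 400 error."""
    if name not in existing_dynamic_fields:
        return None  # nothing to delete
    return {"delete-dynamic-field": {"name": name}}

cmd = safe_delete_command(["*_txt", "*_ngram"], "*_ngram")
skip = safe_delete_command(["*_txt"], "*_ngram")  # → None
```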
Hi there,
I have started enjoying the book, but I am not able to start the containers.
After changing the spacy version from 2.3.1 to 2.3.7 (to fix the very first error, "No matching distribution found for spacy==2.3.0"), I end up with an error about a missing Rust compiler during the tokenizers installation.
Here's my environment:
macos big sur v11.4 Apple chip M1
docker 20.10.8
pip 21.3.1
I am on master, up-to-date.
Can you please have a look?
Many thanks.
Steve
Full log below.
zookeeper uses an image, skipping
Building solr
[+] Building 0.4s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/solr:8.5.2 0.2s
=> [internal] load build context 0.0s
=> => transferring context: 39B 0.0s
=> [1/4] FROM docker.io/library/solr:8.5.2@sha256:dd56c541fb28a60e241550a3eb63afde0d8890a1ffe3971399fd245a22d071be 0.0s
=> CACHED [2/4] ADD run_solr_w_ltr.sh ./run_solr_w_ltr.sh 0.0s
=> CACHED [3/4] RUN chown solr:solr run_solr_w_ltr.sh 0.0s
=> CACHED [4/4] RUN chmod u+x run_solr_w_ltr.sh 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:3348d34a2841ca9d73e6db70d4d22aa998fcfddd9cf16c2b6baedd6754d119d0 0.0s
=> => naming to docker.io/library/docker_solr 0.0s
[15/18] RUN python -m pip install --upgrade pip && pip install -r requirements.txt:
#19 0.438 Defaulting to user installation because normal site-packages is not writeable
#19 0.455 Requirement already satisfied: pip in /opt/conda/lib/python3.7/site-packages (21.3.1)
#19 1.315 Defaulting to user installation because normal site-packages is not writeable
#19 1.430 Collecting appnope==0.1.0
#19 1.513 Downloading appnope-0.1.0-py2.py3-none-any.whl (4.0 kB)
#19 1.545 Collecting attrs==19.1.0
#19 1.564 Downloading attrs-19.1.0-py2.py3-none-any.whl (35 kB)
#19 1.590 Collecting backcall==0.1.0
#19 1.607 Downloading backcall-0.1.0.zip (11 kB)
#19 1.612 Preparing metadata (setup.py): started
#19 1.763 Preparing metadata (setup.py): finished with status 'done'
#19 1.805 Collecting bleach==3.1.4
#19 1.821 Downloading bleach-3.1.4-py2.py3-none-any.whl (151 kB)
#19 1.870 Collecting bs4==0.0.1
#19 1.887 Downloading bs4-0.0.1.tar.gz (1.1 kB)
#19 1.892 Preparing metadata (setup.py): started
#19 2.040 Preparing metadata (setup.py): finished with status 'done'
#19 2.071 Collecting certifi==2019.6.16
#19 2.089 Downloading certifi-2019.6.16-py2.py3-none-any.whl (157 kB)
#19 2.125 Collecting chardet==3.0.4
#19 2.149 Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
#19 2.429 Collecting cython==0.29.20
#19 2.449 Downloading Cython-0.29.20-py2.py3-none-any.whl (973 kB)
#19 2.526 Collecting decorator==4.4.0
#19 2.545 Downloading decorator-4.4.0-py2.py3-none-any.whl (8.3 kB)
#19 2.571 Collecting defusedxml==0.6.0
#19 2.590 Downloading defusedxml-0.6.0-py2.py3-none-any.whl (23 kB)
#19 2.594 Requirement already satisfied: entrypoints==0.3 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 11)) (0.3)
#19 2.615 Collecting findspark==1.3.0
#19 2.632 Downloading findspark-1.3.0-py2.py3-none-any.whl (3.0 kB)
#19 2.659 Collecting idna==2.8
#19 2.675 Downloading idna-2.8-py2.py3-none-any.whl (58 kB)
#19 2.736 Collecting ipykernel==5.1.1
#19 2.753 Downloading ipykernel-5.1.1-py3-none-any.whl (114 kB)
#19 2.819 Collecting ipython==7.5.0
#19 2.846 Downloading ipython-7.5.0-py3-none-any.whl (770 kB)
#19 2.865 Requirement already satisfied: ipython-genutils==0.2.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 16)) (0.2.0)
#19 2.910 Collecting ipywidgets==7.5.0
#19 2.926 Downloading ipywidgets-7.5.0-py2.py3-none-any.whl (121 kB)
#19 2.964 Collecting jedi==0.14.0
#19 2.985 Downloading jedi-0.14.0-py2.py3-none-any.whl (1.0 MB)
#19 3.049 Collecting Jinja2==2.10.1
#19 3.086 Downloading Jinja2-2.10.1-py2.py3-none-any.whl (124 kB)
#19 3.122 Collecting jsonschema==3.0.1
#19 3.142 Downloading jsonschema-3.0.1-py2.py3-none-any.whl (54 kB)
#19 3.167 Collecting jupyter==1.0.0
#19 3.184 Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
#19 3.224 Collecting jupyter-client==5.2.4
#19 3.244 Downloading jupyter_client-5.2.4-py2.py3-none-any.whl (89 kB)
#19 3.272 Collecting jupyter-console==6.0.0
#19 3.289 Downloading jupyter_console-6.0.0-py2.py3-none-any.whl (21 kB)
#19 3.339 Collecting jupyter-core==4.5.0
#19 3.355 Downloading jupyter_core-4.5.0-py2.py3-none-any.whl (78 kB)
#19 3.408 Collecting jupyterlab_server==1.0.0
#19 3.426 Downloading jupyterlab_server-1.0.0-py3-none-any.whl (26 kB)
#19 3.641 Collecting lxml==4.6.2
#19 3.657 Downloading lxml-4.6.2-cp37-cp37m-manylinux2014_aarch64.whl (6.7 MB)
#19 3.872 Collecting MarkupSafe==1.1.1
#19 3.888 Downloading MarkupSafe-1.1.1-cp37-cp37m-manylinux2014_aarch64.whl (34 kB)
#19 3.914 Collecting mergedeep==1.3.0
#19 3.929 Downloading mergedeep-1.3.0-py3-none-any.whl (6.3 kB)
#19 3.935 Requirement already satisfied: mistune==0.8.4 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 29)) (0.8.4)
#19 3.970 Collecting nbconvert==5.5.0
#19 3.991 Downloading nbconvert-5.5.0-py2.py3-none-any.whl (447 kB)
#19 4.027 Collecting nbformat==4.4.0
#19 4.060 Downloading nbformat-4.4.0-py2.py3-none-any.whl (155 kB)
#19 4.108 Collecting notebook==5.7.8
#19 4.139 Downloading notebook-5.7.8-py2.py3-none-any.whl (9.0 MB)
#19 4.420 Collecting nltk==3.5
#19 4.440 Downloading nltk-3.5.zip (1.4 MB)
#19 4.526 Preparing metadata (setup.py): started
#19 4.688 Preparing metadata (setup.py): finished with status 'done'
#19 4.746 Collecting nmslib==2.1.1
#19 4.764 Downloading nmslib-2.1.1-cp37-cp37m-manylinux2014_aarch64.whl (14.0 MB)
#19 5.372 Collecting numpy==1.19.0
#19 5.391 Downloading numpy-1.19.0-cp37-cp37m-manylinux2014_aarch64.whl (12.2 MB)
#19 5.660 Collecting pandocfilters==1.4.2
#19 5.676 Downloading pandocfilters-1.4.2.tar.gz (14 kB)
#19 5.685 Preparing metadata (setup.py): started
#19 5.823 Preparing metadata (setup.py): finished with status 'done'
#19 5.855 Collecting parso==0.5.0
#19 5.872 Downloading parso-0.5.0-py2.py3-none-any.whl (94 kB)
#19 5.905 Collecting pexpect==4.7.0
#19 5.922 Downloading pexpect-4.7.0-py2.py3-none-any.whl (58 kB)
#19 5.930 Requirement already satisfied: pickleshare==0.7.5 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 40)) (0.7.5)
#19 5.991 Collecting plotly==4.14.3
#19 6.009 Downloading plotly-4.14.3-py2.py3-none-any.whl (13.2 MB)
#19 6.403 Collecting plotnine==0.7.1
#19 6.428 Downloading plotnine-0.7.1-py3-none-any.whl (4.4 MB)
#19 6.558 Collecting prometheus-client==0.7.1
#19 6.575 Downloading prometheus_client-0.7.1.tar.gz (38 kB)
#19 6.586 Preparing metadata (setup.py): started
#19 6.734 Preparing metadata (setup.py): finished with status 'done'
#19 6.799 Collecting prompt-toolkit==2.0.9
#19 6.817 Downloading prompt_toolkit-2.0.9-py3-none-any.whl (337 kB)
#19 6.846 Collecting ptyprocess==0.6.0
#19 6.864 Downloading ptyprocess-0.6.0-py2.py3-none-any.whl (39 kB)
#19 6.896 Collecting py4j==0.10.7
#19 6.913 Downloading py4j-0.10.7-py2.py3-none-any.whl (197 kB)
#19 6.957 Collecting Pygments==2.4.2
#19 6.978 Downloading Pygments-2.4.2-py2.py3-none-any.whl (883 kB)
#19 7.027 Collecting pyrsistent==0.15.2
#19 7.044 Downloading pyrsistent-0.15.2.tar.gz (106 kB)
#19 7.061 Preparing metadata (setup.py): started
#19 7.208 Preparing metadata (setup.py): finished with status 'done'
#19 7.236 Collecting pysolr==3.8.1
#19 7.251 Downloading pysolr-3.8.1-py2.py3-none-any.whl (16 kB)
#19 7.284 Collecting python-dateutil==2.8.0
#19 7.306 Downloading python_dateutil-2.8.0-py2.py3-none-any.whl (226 kB)
#19 7.383 Collecting pytz==2019.1
#19 7.401 Downloading pytz-2019.1-py2.py3-none-any.whl (510 kB)
#19 7.573 Collecting pyzmq==18.0.1
#19 7.596 Downloading pyzmq-18.0.1.tar.gz (1.2 MB)
#19 7.749 Preparing metadata (setup.py): started
#19 7.942 Preparing metadata (setup.py): finished with status 'done'
#19 7.977 Collecting qtconsole==4.5.1
#19 7.994 Downloading qtconsole-4.5.1-py2.py3-none-any.whl (118 kB)
#19 8.057 Collecting requests==2.22.0
#19 8.082 Downloading requests-2.22.0-py2.py3-none-any.whl (57 kB)
#19 8.106 Collecting retrying==1.3.3
#19 8.122 Downloading retrying-1.3.3.tar.gz (10 kB)
#19 8.129 Preparing metadata (setup.py): started
#19 8.272 Preparing metadata (setup.py): finished with status 'done'
#19 8.298 Collecting Send2Trash==1.5.0
#19 8.315 Downloading Send2Trash-1.5.0-py3-none-any.whl (12 kB)
#19 8.342 Collecting six==1.12.0
#19 8.363 Downloading six-1.12.0-py2.py3-none-any.whl (10 kB)
#19 8.487 Collecting spacy==2.3.7
#19 8.504 Downloading spacy-2.3.7.tar.gz (5.8 MB)
#19 9.188 Installing build dependencies: started
#19 110.1 Installing build dependencies: still running...
#19 178.1 Installing build dependencies: still running...
#19 184.5 Installing build dependencies: finished with status 'done'
#19 184.5 Getting requirements to build wheel: started
#19 184.7 Getting requirements to build wheel: finished with status 'done'
#19 184.8 Installing backend dependencies: started
#19 186.1 Installing backend dependencies: finished with status 'done'
#19 186.1 Preparing metadata (pyproject.toml): started
#19 186.4 Preparing metadata (pyproject.toml): finished with status 'done'
#19 186.4 Collecting transformers==4.5.1
#19 186.5 Downloading transformers-4.5.1-py3-none-any.whl (2.1 MB)
#19 186.5 Collecting sentence-transformers==1.1.0
#19 186.5 Downloading sentence-transformers-1.1.0.tar.gz (78 kB)
#19 186.6 Preparing metadata (setup.py): started
#19 186.7 Preparing metadata (setup.py): finished with status 'done'
#19 186.7 Collecting testpath==0.4.2
#19 186.8 Downloading testpath-0.4.2-py2.py3-none-any.whl (163 kB)
#19 186.8 Collecting tornado==6.0.3
#19 186.8 Downloading tornado-6.0.3.tar.gz (482 kB)
#19 186.9 Preparing metadata (setup.py): started
#19 187.0 Preparing metadata (setup.py): finished with status 'done'
#19 187.1 Collecting traitlets==4.3.2
#19 187.1 Downloading traitlets-4.3.2-py2.py3-none-any.whl (74 kB)
#19 187.2 Collecting urllib3==1.25.4
#19 187.2 Downloading urllib3-1.25.4-py2.py3-none-any.whl (125 kB)
#19 187.2 Collecting wcwidth==0.1.7
#19 187.2 Downloading wcwidth-0.1.7-py2.py3-none-any.whl (21 kB)
#19 187.2 Requirement already satisfied: webencodings==0.5.1 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 68)) (0.5.1)
#19 187.3 Collecting widgetsnbextension==3.5.0
#19 187.4 Downloading widgetsnbextension-3.5.0-py2.py3-none-any.whl (2.2 MB)
#19 187.5 Requirement already satisfied: beautifulsoup4 in /opt/conda/lib/python3.7/site-packages (from bs4==0.0.1->-r requirements.txt (line 5)) (4.10.0)
#19 187.5 Requirement already satisfied: setuptools>=18.5 in /opt/conda/lib/python3.7/site-packages (from ipython==7.5.0->-r requirements.txt (line 15)) (60.5.0)
#19 187.6 Requirement already satisfied: json5 in /opt/conda/lib/python3.7/site-packages (from jupyterlab_server==1.0.0->-r requirements.txt (line 25)) (0.9.5)
#19 187.7 Requirement already satisfied: terminado>=0.8.1 in /opt/conda/lib/python3.7/site-packages (from notebook==5.7.8->-r requirements.txt (line 32)) (0.12.1)
#19 187.7 Requirement already satisfied: click in /opt/conda/lib/python3.7/site-packages (from nltk==3.5->-r requirements.txt (line 34)) (8.0.3)
#19 187.7 Requirement already satisfied: joblib in /opt/conda/lib/python3.7/site-packages (from nltk==3.5->-r requirements.txt (line 34)) (1.1.0)
#19 188.1 Collecting regex
#19 188.1 Downloading regex-2021.11.10-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (746 kB)
#19 188.2 Requirement already satisfied: tqdm in /opt/conda/lib/python3.7/site-packages (from nltk==3.5->-r requirements.txt (line 34)) (4.62.3)
#19 188.2 Requirement already satisfied: psutil in /opt/conda/lib/python3.7/site-packages (from nmslib==2.1.1->-r requirements.txt (line 35)) (5.9.0)
#19 188.2 Collecting pybind11<2.6.2
#19 188.2 Downloading pybind11-2.6.1-py2.py3-none-any.whl (188 kB)
#19 188.3 Requirement already satisfied: statsmodels>=0.11.1 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 42)) (0.13.1)
#19 188.3 Collecting descartes>=1.1.0
#19 188.3 Downloading descartes-1.1.0-py3-none-any.whl (5.8 kB)
#19 188.3 Requirement already satisfied: patsy>=0.5.1 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 42)) (0.5.2)
#19 188.3 Requirement already satisfied: matplotlib>=3.1.1 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 42)) (3.5.1)
#19 188.3 Requirement already satisfied: pandas>=1.1.0 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 42)) (1.3.5)
#19 188.4 Collecting mizani>=0.7.1
#19 188.4 Downloading mizani-0.7.3-py3-none-any.whl (63 kB)
#19 188.4 Requirement already satisfied: scipy>=1.2.0 in /opt/conda/lib/python3.7/site-packages (from plotnine==0.7.1->-r requirements.txt (line 42)) (1.7.3)
#19 188.5 Collecting wasabi<1.1.0,>=0.4.0
#19 188.5 Using cached wasabi-0.9.0-py3-none-any.whl (25 kB)
#19 188.6 Collecting cymem<2.1.0,>=2.0.2
#19 188.6 Using cached cymem-2.0.6-cp37-cp37m-linux_aarch64.whl
#19 188.6 Collecting plac<1.2.0,>=0.9.6
#19 188.6 Using cached plac-1.1.3-py2.py3-none-any.whl (20 kB)
#19 188.6 Collecting preshed<3.1.0,>=3.0.2
#19 188.6 Using cached preshed-3.0.6-cp37-cp37m-linux_aarch64.whl
#19 188.7 Collecting murmurhash<1.1.0,>=0.28.0
#19 188.7 Using cached murmurhash-1.0.6-cp37-cp37m-linux_aarch64.whl
#19 188.8 Collecting thinc<7.5.0,>=7.4.1
#19 188.8 Using cached thinc-7.4.5-cp37-cp37m-linux_aarch64.whl
#19 188.9 Collecting blis<0.8.0,>=0.4.0
#19 188.9 Using cached blis-0.7.5-cp37-cp37m-linux_aarch64.whl
#19 188.9 Collecting catalogue<1.1.0,>=0.0.7
#19 188.9 Using cached catalogue-1.0.0-py2.py3-none-any.whl (7.7 kB)
#19 189.0 Collecting srsly<1.1.0,>=1.0.2
#19 189.0 Using cached srsly-1.0.5-cp37-cp37m-linux_aarch64.whl
#19 189.1 Collecting filelock
#19 189.1 Downloading filelock-3.4.2-py3-none-any.whl (9.9 kB)
#19 189.2 Collecting sacremoses
#19 189.2 Downloading sacremoses-0.0.47-py2.py3-none-any.whl (895 kB)
#19 189.2 Requirement already satisfied: importlib-metadata in /opt/conda/lib/python3.7/site-packages (from transformers==4.5.1->-r requirements.txt (line 59)) (4.10.0)
#19 189.4 Collecting tokenizers<0.11,>=0.10.1
#19 189.4 Downloading tokenizers-0.10.3.tar.gz (212 kB)
#19 189.5 Installing build dependencies: started
#19 191.5 Installing build dependencies: finished with status 'done'
#19 191.5 Getting requirements to build wheel: started
#19 191.7 Getting requirements to build wheel: finished with status 'done'
#19 191.7 Preparing metadata (pyproject.toml): started
#19 191.8 Preparing metadata (pyproject.toml): finished with status 'done'
#19 191.8 Requirement already satisfied: packaging in /opt/conda/lib/python3.7/site-packages (from transformers==4.5.1->-r requirements.txt (line 59)) (21.2)
#19 191.9 Collecting torch>=1.6.0
#19 192.2 Downloading torch-1.10.1-cp37-cp37m-manylinux2014_aarch64.whl (51.0 MB)
#19 193.2 Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.7/site-packages (from sentence-transformers==1.1.0->-r requirements.txt (line 60)) (1.0.2)
#19 193.3 Collecting sentencepiece
#19 193.3 Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB)
#19 193.4 Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->transformers==4.5.1->-r requirements.txt (line 59)) (3.6.0)
#19 193.4 Requirement already satisfied: typing-extensions>=3.6.4 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->transformers==4.5.1->-r requirements.txt (line 59)) (3.10.0.2)
#19 193.4 Requirement already satisfied: pyparsing>=2.2.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=3.1.1->plotnine==0.7.1->-r requirements.txt (line 42)) (2.4.7)
#19 193.4 Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=3.1.1->plotnine==0.7.1->-r requirements.txt (line 42)) (4.28.5)
#19 193.4 Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=3.1.1->plotnine==0.7.1->-r requirements.txt (line 42)) (0.11.0)
#19 193.4 Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=3.1.1->plotnine==0.7.1->-r requirements.txt (line 42)) (1.3.2)
#19 193.4 Requirement already satisfied: pillow>=6.2.0 in /opt/conda/lib/python3.7/site-packages (from matplotlib>=3.1.1->plotnine==0.7.1->-r requirements.txt (line 42)) (8.4.0)
#19 193.5 Collecting palettable
#19 193.5 Downloading palettable-3.3.0-py2.py3-none-any.whl (111 kB)
#19 193.7 Collecting pandas>=1.1.0
#19 193.8 Downloading pandas-1.3.4-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (10.7 MB)
#19 194.0 Downloading pandas-1.3.3-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (10.7 MB)
#19 194.4 Requirement already satisfied: soupsieve>1.2 in /opt/conda/lib/python3.7/site-packages (from beautifulsoup4->bs4==0.0.1->-r requirements.txt (line 5)) (2.0.1)
#19 194.5 Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from scikit-learn->sentence-transformers==1.1.0->-r requirements.txt (line 60)) (3.0.0)
#19 194.6 Building wheels for collected packages: backcall, bs4, nltk, pandocfilters, prometheus-client, pyrsistent, pyzmq, retrying, spacy, sentence-transformers, tornado, tokenizers
#19 194.6 Building wheel for backcall (setup.py): started
#19 194.8 Building wheel for backcall (setup.py): finished with status 'done'
#19 194.8 Created wheel for backcall: filename=backcall-0.1.0-py3-none-any.whl size=10413 sha256=d446198efc54c473b28e330cf08a9b80d4e0e5ba7685678634b2a2eefe3fc48b
#19 194.8 Stored in directory: /home/jovyan/.cache/pip/wheels/60/5a/10/2177abb11261d49069a732cbc0e66207783c7ee79c1f807167
#19 194.8 Building wheel for bs4 (setup.py): started
#19 195.0 Building wheel for bs4 (setup.py): finished with status 'done'
#19 195.0 Created wheel for bs4: filename=bs4-0.0.1-py3-none-any.whl size=1271 sha256=434cd8a88de0a62283a52bf90a8e0e970476bdf89513753ca9d7bd9ce583f88d
#19 195.0 Stored in directory: /home/jovyan/.cache/pip/wheels/0a/9e/ba/20e5bbc1afef3a491f0b3bb74d508f99403aabe76eda2167ca
#19 195.0 Building wheel for nltk (setup.py): started
#19 195.5 Building wheel for nltk (setup.py): finished with status 'done'
#19 195.5 Created wheel for nltk: filename=nltk-3.5-py3-none-any.whl size=1434693 sha256=320180c519a60c785e771f927f07268893476bb211e23fa24fa3c39ce330815c
#19 195.5 Stored in directory: /home/jovyan/.cache/pip/wheels/45/6c/46/a1865e7ba706b3817f5d1b2ff7ce8996aabdd0d03d47ba0266
#19 195.5 Building wheel for pandocfilters (setup.py): started
#19 195.7 Building wheel for pandocfilters (setup.py): finished with status 'done'
#19 195.7 Created wheel for pandocfilters: filename=pandocfilters-1.4.2-py3-none-any.whl size=7871 sha256=b5280b820859d0124baf03fe2d0df4d132d5a1d21adaa7ee7e468e429b8c8b9b
#19 195.7 Stored in directory: /home/jovyan/.cache/pip/wheels/63/99/01/9fe785b86d1e091a6b2a61e06ddb3d8eb1bc9acae5933d4740
#19 195.7 Building wheel for prometheus-client (setup.py): started
#19 195.9 Building wheel for prometheus-client (setup.py): finished with status 'done'
#19 195.9 Created wheel for prometheus-client: filename=prometheus_client-0.7.1-py3-none-any.whl size=41404 sha256=3c5dcfe44be00e3cc625bf43c1b82ccab4c70c32396183dc29227ff06c18f73c
#19 195.9 Stored in directory: /home/jovyan/.cache/pip/wheels/30/0c/26/59ba285bf65dc79d195e9b25e2ddde4c61070422729b0cd914
#19 195.9 Building wheel for pyrsistent (setup.py): started
#19 196.6 Building wheel for pyrsistent (setup.py): finished with status 'done'
#19 196.6 Created wheel for pyrsistent: filename=pyrsistent-0.15.2-cp37-cp37m-linux_aarch64.whl size=129753 sha256=9a4ca4be9aef2fbd77f4e2a769f87d67a4805fcc5940a9ab329a059645c95419
#19 196.6 Stored in directory: /home/jovyan/.cache/pip/wheels/bf/84/c2/d54c1edb44cc3248e7a0bc06f76395b7e971eaf52b3f63835b
#19 196.6 Building wheel for pyzmq (setup.py): started
#19 251.2 Building wheel for pyzmq (setup.py): finished with status 'done'
#19 251.2 Created wheel for pyzmq: filename=pyzmq-18.0.1-cp37-cp37m-linux_aarch64.whl size=6365580 sha256=c26cca45da14c3c3ae0654f29a224970de8ae284f66c58fcc7a0860012828d00
#19 251.2 Stored in directory: /home/jovyan/.cache/pip/wheels/41/19/79/c53cb3e5358dff43028ac8afed4f3dcf3ced0239f9fa97984f
#19 251.2 Building wheel for retrying (setup.py): started
#19 251.5 Building wheel for retrying (setup.py): finished with status 'done'
#19 251.5 Created wheel for retrying: filename=retrying-1.3.3-py3-none-any.whl size=11448 sha256=3752c6106eff9307a81f111a601a72d0d548ae679e4d38315d2c6b637c32522e
#19 251.5 Stored in directory: /home/jovyan/.cache/pip/wheels/f9/8d/8d/f6af3f7f9eea3553bc2fe6d53e4b287dad18b06a861ac56ddf
#19 251.5 Building wheel for spacy (pyproject.toml): started
#19 312.5 Building wheel for spacy (pyproject.toml): still running...
#19 373.3 Building wheel for spacy (pyproject.toml): still running...
#19 433.6 Building wheel for spacy (pyproject.toml): still running...
#19 441.3 Building wheel for spacy (pyproject.toml): finished with status 'done'
#19 441.3 Created wheel for spacy: filename=spacy-2.3.7-cp37-cp37m-linux_aarch64.whl size=26652568 sha256=9c3ab5eb8e4723226a3e68cf4943bb779010b0d11f6321938f36bc74cd3c293d
#19 441.3 Stored in directory: /home/jovyan/.cache/pip/wheels/aa/99/63/f57e42849e2e628229458201f2d3e61896ed3cfe2fe0c339e3
#19 441.3 Building wheel for sentence-transformers (setup.py): started
#19 441.6 Building wheel for sentence-transformers (setup.py): finished with status 'done'
#19 441.6 Created wheel for sentence-transformers: filename=sentence_transformers-1.1.0-py3-none-any.whl size=119616 sha256=cb1d60ae82025d648865233c1d6fa94990f677990bb430c68a2f0863ae05ea68
#19 441.6 Stored in directory: /home/jovyan/.cache/pip/wheels/20/fd/72/b2524b6c3af92dae3ce173595aeff673a8114255809a9aa381
#19 441.6 Building wheel for tornado (setup.py): started
#19 441.9 Building wheel for tornado (setup.py): finished with status 'done'
#19 441.9 Created wheel for tornado: filename=tornado-6.0.3-cp37-cp37m-linux_aarch64.whl size=424708 sha256=3af5a2cbe946cbd2be632b4cc9a721868ec22ab5dabb04aeac424c658f8ea6f2
#19 441.9 Stored in directory: /home/jovyan/.cache/pip/wheels/d0/31/2c/9406ed59f0dcdce0c453a8664124d738551590e74fc087f604
#19 442.0 Building wheel for tokenizers (pyproject.toml): started
#19 442.1 Building wheel for tokenizers (pyproject.toml): finished with status 'error'
#19 442.1 ERROR: Command errored out with exit status 1:
#19 442.1 command: /opt/conda/bin/python3.7 /opt/conda/lib/python3.7/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /tmp/tmptmcnxiko
#19 442.1 cwd: /tmp/pip-install-7dlawjhx/tokenizers_81a2f8296e644e608c00a45a89cdcd07
#19 442.1 Complete output (50 lines):
#19 442.1 running bdist_wheel
#19 442.1 running build
#19 442.1 running build_py
#19 442.1 creating build
#19 442.1 creating build/lib.linux-aarch64-3.7
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers
#19 442.1 copying py_src/tokenizers/init.py -> build/lib.linux-aarch64-3.7/tokenizers
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/models
#19 442.1 copying py_src/tokenizers/models/init.py -> build/lib.linux-aarch64-3.7/tokenizers/models
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/decoders
#19 442.1 copying py_src/tokenizers/decoders/init.py -> build/lib.linux-aarch64-3.7/tokenizers/decoders
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/normalizers
#19 442.1 copying py_src/tokenizers/normalizers/init.py -> build/lib.linux-aarch64-3.7/tokenizers/normalizers
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/pre_tokenizers
#19 442.1 copying py_src/tokenizers/pre_tokenizers/init.py -> build/lib.linux-aarch64-3.7/tokenizers/pre_tokenizers
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/processors
#19 442.1 copying py_src/tokenizers/processors/init.py -> build/lib.linux-aarch64-3.7/tokenizers/processors
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/trainers
#19 442.1 copying py_src/tokenizers/trainers/init.py -> build/lib.linux-aarch64-3.7/tokenizers/trainers
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/init.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/byte_level_bpe.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/sentencepiece_bpe.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/base_tokenizer.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/bert_wordpiece.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/char_level_bpe.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 copying py_src/tokenizers/implementations/sentencepiece_unigram.py -> build/lib.linux-aarch64-3.7/tokenizers/implementations
#19 442.1 creating build/lib.linux-aarch64-3.7/tokenizers/tools
#19 442.1 copying py_src/tokenizers/tools/init.py -> build/lib.linux-aarch64-3.7/tokenizers/tools
#19 442.1 copying py_src/tokenizers/tools/visualizer.py -> build/lib.linux-aarch64-3.7/tokenizers/tools
#19 442.1 copying py_src/tokenizers/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers
#19 442.1 copying py_src/tokenizers/models/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers/models
#19 442.1 copying py_src/tokenizers/decoders/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers/decoders
#19 442.1 copying py_src/tokenizers/normalizers/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers/normalizers
#19 442.1 copying py_src/tokenizers/pre_tokenizers/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers/pre_tokenizers
#19 442.1 copying py_src/tokenizers/processors/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers/processors
#19 442.1 copying py_src/tokenizers/trainers/init.pyi -> build/lib.linux-aarch64-3.7/tokenizers/trainers
#19 442.1 copying py_src/tokenizers/tools/visualizer-styles.css -> build/lib.linux-aarch64-3.7/tokenizers/tools
#19 442.1 running build_ext
#19 442.1 error: can't find Rust compiler
#19 442.1
#19 442.1 If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
#19 442.1
#19 442.1 To update pip, run:
#19 442.1
#19 442.1 pip install --upgrade pip
#19 442.1
#19 442.1 and then retry package installation.
#19 442.1
#19 442.1 If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
#19 442.1 ----------------------------------------
#19 442.1 ERROR: Failed building wheel for tokenizers
#19 442.1 Successfully built backcall bs4 nltk pandocfilters prometheus-client pyrsistent pyzmq retrying spacy sentence-transformers tornado
#19 442.1 Failed to build tokenizers
#19 442.1 ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
executor failed running [/bin/bash -o pipefail -c python -m pip install --upgrade pip && pip install -r requirements.txt]: exit code: 1
ERROR: Service 'notebooks' failed to build : Build failed
Right now, all notebooks are untrusted the first time you load them. Should we make them trusted? (docs)
If we expect people to run the code in the notebooks, should we strip the results from them, or leave them committed?
With results committed, it's a bit harder to do code reviews, etc.
We're using Spark 2.4, which is outdated and lacks many performance improvements. It would be better to switch to Spark 3.x and install PySpark via pip
instead of downloading it manually.
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[2], line 1
----> 1 product_description = pd.read_json("../data/temp/product_description.json")
2 signals = pd.read_json("../data/temp/signal_sample.json")
3 signals["query"] = signals["query_s"].apply(lambda x: re.sub("s$","", x.lower())) #conduct minimum stemming
File /opt/conda/lib/python3.10/site-packages/pandas/util/_decorators.py:211, in deprecate_kwarg.<locals>._deprecate_kwarg.<locals>.wrapper(*args, **kwargs)
209 else:
210 kwargs[new_arg_name] = new_arg_value
--> 211 return func(*args, **kwargs)
It doesn't work anyway...
[W 13:01:53.975 NotebookApp] Loading JupyterLab as a classic notebook (v6) extension.
[C 13:01:53.976 NotebookApp] You must use Jupyter Server v1 to load JupyterLab as notebook extension. You have v2.3.0 installed.
You can fix this by executing:
pip install -U "jupyter-server<2.0.0"
In the 3.ch10-pairwise-transform notebook, command 5 didn't display the chart with the call to plot_judgments.
I needed to run %matplotlib inline
to get the plot to show up.
The IFrame shows connection refused even though the web server is running.
But in general, it's not clear why we need it in this notebook...
The current version uses 4.0.2, while 4.0.3 is already available.
JupyterLab provides more features, including a tabbed interface.
If we switch to it, we'll need to regenerate the screenshots for Appendix A.
Also, versions installed from the notebook conflict with those installed in Jupyter.
When you run docker-compose up,
the environment setup fails because there are no images available for Apple Silicon Macs (linux/arm64):
docker-compose up
[+] Running 0/1
⠴ zookeeper Pulling 2.5s
no matching manifest for linux/arm64/v8 in the manifest list entries
When running this part of the code:
def download_outdoors_dataset():
    from ltr.download import download, extract_tgz
    import tarfile
    dataset = ['https://github.com/ai-powered-search/outdoors/raw/master/outdoors.tgz']
    download(dataset, dest='data/')
    extract_tgz('data/outdoors.tgz') # -> Holds 'outdoors.csv', a big CSV file of the stackexchange outdoors dataset
Receiving this error:
ReadError: not a gzip file
Inside the .tgz file is a bunch of HTML; I do not see any CSV files.
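Since the repo now ships the archive as split parts (hence the `cat outdoors.tgz* | tar -xz` workaround mentioned for the ch5 notebook), the reassembly can also be done in Python. This is a hedged sketch: the part suffixes (.aa, .ab) and file names in the demo are illustrative, only the `outdoors.tgz*` glob pattern comes from that workaround.

```python
import io
import tarfile
import tempfile
from pathlib import Path

def reassemble_and_extract(parts_dir, dest_dir, pattern="outdoors.tgz*"):
    # Concatenate the split parts in name order, then extract the combined
    # bytes as a single gzipped tarball.
    parts = sorted(Path(parts_dir).glob(pattern))
    blob = b"".join(part.read_bytes() for part in parts)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())

# Self-contained demo: build a tiny .tgz, split it in two, reassemble it.
workdir = Path(tempfile.mkdtemp())
(workdir / "posts.csv").write_text("id,title\n1,hello\n")
with tarfile.open(workdir / "whole.tgz", "w:gz") as archive:
    archive.add(workdir / "posts.csv", arcname="posts.csv")
data = (workdir / "whole.tgz").read_bytes()
(workdir / "parts").mkdir()
(workdir / "parts" / "outdoors.tgz.aa").write_bytes(data[: len(data) // 2])
(workdir / "parts" / "outdoors.tgz.ab").write_bytes(data[len(data) // 2:])
extracted = reassemble_and_extract(workdir / "parts", workdir / "out")
print(extracted)  # ['posts.csv']
```

The "not a gzip file" / HTML symptom is consistent with downloading a GitHub HTML page (or a single split part) instead of the full archive, so reassembling all parts before extraction is the key step.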
Building the docker-notebooks image fails.
docker-compose up
fails with this error:
[docker-notebooks 4/16] COPY requirements.txt ./ 0.0s
=> ERROR [docker-notebooks 5/16] RUN python -m pip --no-cache-dir install --upgrade pip && pip --no-cache-dir install -r requirements.txt
#0 161.3 running build_ext
#0 161.3 running build_rust
#0 161.3 error: can't find Rust compiler
#0 161.3
#0 161.3 If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
#0 161.3
#0 161.3 To update pip, run:
#0 161.3
#0 161.3 pip install --upgrade pip
#0 161.3
#0 161.3 and then retry package installation.
#0 161.3
#0 161.3 If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
#0 161.3 [end of output]
#0 161.3
#0 161.3 note: This error originates from a subprocess, and is likely not a problem with pip.
#0 161.3 ERROR: Failed building wheel for tokenizers
#0 161.3 Building wheel for sacremoses (setup.py): started
#0 161.6 Building wheel for sacremoses (setup.py): finished with status 'done'
#0 161.6 Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895260 sha256=c0570d73297a0d503f9b7d64e9302ac5028a9d10417ed3375cbb8b0027e98382
#0 161.6 Stored in directory: /tmp/pip-ephem-wheel-cache-bg0l1kp3/wheels/42/79/78/5ad3b042cb2d97c294535162cdbaf9b167e3b186eae55ab72d
#0 161.6 Successfully built bs4 pysolr retrying sentence-transformers spacy sacremoses
#0 161.6 Failed to build tokenizers
#0 161.6 ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
------
failed to solve: executor failed running [/bin/bash -o pipefail -c python -m pip --no-cache-dir install --upgrade pip && pip --no-cache-dir install -r requirements.txt]: exit code: 1
When I issue the command "docker-compose up", everything appears to run fine until it comes time to install Python packages for the notebooks service. At that point, I receive: "ERROR: Could not install packages due to an OSError: [Errno 28] No space left on device". I don't think my device lacks disk space or RAM. Any suggestions what might be causing this issue? Is there any chance you could include system requirements in the README as well?
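For triage it helps to distinguish the host's free disk from Docker's own allocation: Docker Desktop on macOS/Windows stores images in a separate VM disk that can fill up while the host still reports plenty of space, and `docker system df` / `docker system prune` report and reclaim Docker's share. A quick host-side check, with "/" assumed as the relevant mount point:

```python
import shutil

# Free space on the filesystem where Docker (or the checkout) lives; on
# Docker Desktop this does NOT reflect the VM's internal disk allocation,
# which is configured separately in Docker Desktop's settings.
total, used, free = shutil.disk_usage("/")
print(f"total={total / 2**30:.1f} GiB, "
      f"used={used / 2**30:.1f} GiB, "
      f"free={free / 2**30:.1f} GiB")
```

If the host shows ample free space but builds still fail with Errno 28, the Docker-managed disk is the likely culprit; pruning unused images and build cache usually frees it.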
Some dependencies specified in requirements.txt, like transformers,
are later replaced by installing https://github.com/explosion/spacy-experimental/releases/download/v0.6.1/en_coreference_web_trf-3.4.0a2-py3-none-any.whl:
en-coreference-web-trf-3.4.0a2 spacy-alignments-0.9.1 spacy-transformers-1.1.9 transformers-4.25.1
We either need to add a comment, or add code that will clone and unpack this dataset.
Hi, when trying to run Listing 5.1 via the provided docker-compose image on an Apple M2 machine, I get an exception (below). Other sections of the notebooks are working as expected, and the same code works correctly on an x86_64-based Linux system, also running via docker-compose. Perhaps an issue with spaCy or PyTorch support on that architecture? (I can offer to help test patches, but I'm not sure how to further triage the underlying issue.)
Specific notebook entry:
exception
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[8], line 16
13 facts.extend(resolve_facts(sentence)) # subj:(Companies), rel:(employ), obj:(Data Scientists)
14 return facts
---> 16 graph = generate_graph(text)
17 for i in graph: print(i)
Cell In[8], line 7, in generate_graph(text)
6 def generate_graph(text):
----> 7 doc = coref_model(text)
8 doc = resolve_coreferences(doc) # "they" => "Data Scientists"
9 sentences = get_sentences(lang_model(doc)) # Data Scientists also write code. => ['nsubj, 'advmod', ROOT', 'dobj', 'punct']
File /opt/conda/lib/python3.10/site-packages/spacy/language.py:1031, in Language.__call__(self, text, disable, component_cfg)
1029 raise ValueError(Errors.E109.format(name=name)) from e
1030 except Exception as e:
-> 1031 error_handler(name, proc, [doc], e)
1032 if not isinstance(doc, Doc):
1033 raise ValueError(Errors.E005.format(name=name, returned_type=type(doc)))
File /opt/conda/lib/python3.10/site-packages/spacy/util.py:1670, in raise_error(proc_name, proc, docs, e)
1669 def raise_error(proc_name, proc, docs, e):
-> 1670 raise e
File /opt/conda/lib/python3.10/site-packages/spacy/language.py:1026, in Language.__call__(self, text, disable, component_cfg)
1024 error_handler = proc.get_error_handler()
1025 try:
-> 1026 doc = proc(doc, **component_cfg.get(name, {})) # type: ignore[call-arg]
1027 except KeyError as e:
1028 # This typically happens if a component is not initialized
1029 raise ValueError(Errors.E109.format(name=name)) from e
File /opt/conda/lib/python3.10/site-packages/spacy/pipeline/trainable_pipe.pyx:56, in spacy.pipeline.trainable_pipe.TrainablePipe.__call__()
File /opt/conda/lib/python3.10/site-packages/spacy/util.py:1670, in raise_error(proc_name, proc, docs, e)
1669 def raise_error(proc_name, proc, docs, e):
-> 1670 raise e
File /opt/conda/lib/python3.10/site-packages/spacy/pipeline/trainable_pipe.pyx:52, in spacy.pipeline.trainable_pipe.TrainablePipe.__call__()
File /opt/conda/lib/python3.10/site-packages/spacy_experimental/coref/coref_component.py:153, in CoreferenceResolver.predict(self, docs)
150 out.append([])
151 continue
--> 153 scores, idxs = self.model.predict([doc])
154 # idxs is a list of mentions (start / end idxs)
155 # each item in scores includes scores and a mapping from scores to mentions
156 ant_idxs = idxs
File /opt/conda/lib/python3.10/site-packages/thinc/model.py:334, in Model.predict(self, X)
330 def predict(self, X: InT) -> OutT:
331 """Call the model's `forward` function with `is_train=False`, and return
332 only the output, instead of the `(output, callback)` tuple.
333 """
--> 334 return self._func(self, X, is_train=False)[0]
File /opt/conda/lib/python3.10/site-packages/thinc/layers/chain.py:54, in forward(model, X, is_train)
52 callbacks = []
53 for layer in model.layers:
---> 54 Y, inc_layer_grad = layer(X, is_train=is_train)
55 callbacks.append(inc_layer_grad)
56 X = Y
File /opt/conda/lib/python3.10/site-packages/thinc/model.py:310, in Model.__call__(self, X, is_train)
307 def __call__(self, X: InT, is_train: bool) -> Tuple[OutT, Callable]:
308 """Call the model's `forward` function, returning the output and a
309 callback to compute the gradients via backpropagation."""
--> 310 return self._func(self, X, is_train=is_train)
File /opt/conda/lib/python3.10/site-packages/spacy_experimental/coref/coref_model.py:85, in coref_forward(model, X, is_train)
84 def coref_forward(model: Model, X, is_train: bool):
---> 85 return model.layers[0](X, is_train)
File /opt/conda/lib/python3.10/site-packages/thinc/model.py:310, in Model.__call__(self, X, is_train)
307 def __call__(self, X: InT, is_train: bool) -> Tuple[OutT, Callable]:
308 """Call the model's `forward` function, returning the output and a
309 callback to compute the gradients via backpropagation."""
--> 310 return self._func(self, X, is_train=is_train)
File /opt/conda/lib/python3.10/site-packages/thinc/layers/pytorchwrapper.py:225, in forward(model, X, is_train)
222 convert_outputs = model.attrs["convert_outputs"]
224 Xtorch, get_dX = convert_inputs(model, X, is_train)
--> 225 Ytorch, torch_backprop = model.shims[0](Xtorch, is_train)
226 Y, get_dYtorch = convert_outputs(model, (X, Ytorch), is_train)
228 def backprop(dY: Any) -> Any:
File /opt/conda/lib/python3.10/site-packages/thinc/shims/pytorch.py:97, in PyTorchShim.__call__(self, inputs, is_train)
95 return self.begin_update(inputs)
96 else:
---> 97 return self.predict(inputs), lambda a: ...
File /opt/conda/lib/python3.10/site-packages/thinc/shims/pytorch.py:115, in PyTorchShim.predict(self, inputs)
113 with torch.no_grad():
114 with torch.cuda.amp.autocast(self._mixed_precision):
--> 115 outputs = self._model(*inputs.args, **inputs.kwargs)
116 self._model.train()
117 return outputs
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File /opt/conda/lib/python3.10/site-packages/spacy_experimental/coref/pytorch_coref_model.py:88, in CorefClusterer.forward(self, word_features)
85 top_rough_scores_batch = top_rough_scores[i : i + batch_size]
87 # a_scores_batch [batch_size, n_ants]
---> 88 a_scores_batch = self.ana_scorer(
89 all_mentions=words,
90 mentions_batch=words_batch,
91 pairwise_batch=pairwise_batch,
92 top_indices_batch=top_indices_batch,
93 top_rough_scores_batch=top_rough_scores_batch,
94 )
95 a_scores_lst.append(a_scores_batch)
97 coref_scores = torch.cat(a_scores_lst, dim=0)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File /opt/conda/lib/python3.10/site-packages/spacy_experimental/coref/pytorch_coref_model.py:165, in AnaphoricityScorer.forward(self, all_mentions, mentions_batch, pairwise_batch, top_indices_batch, top_rough_scores_batch)
160 pair_matrix = self._get_pair_matrix(
161 all_mentions, mentions_batch, pairwise_batch, top_indices_batch
162 )
164 # [batch_size, n_ants]
--> 165 scores = top_rough_scores_batch + self._ffnn(pair_matrix)
166 scores = add_dummy(scores, eps=True)
168 return scores
File /opt/conda/lib/python3.10/site-packages/spacy_experimental/coref/pytorch_coref_model.py:175, in AnaphoricityScorer._ffnn(self, x)
170 def _ffnn(self, x: torch.Tensor) -> torch.Tensor:
171 """
172 x: tensor of shape (batch_size x rough_k x n_features
173 returns: tensor of shape (batch_size x antecedent_limit)
174 """
--> 175 x = self.out(self.hidden(x))
176 return x.squeeze(2)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
RuntimeError: could not create a primitive descriptor for a matmul primitive
With this docker-compose.yml
file I get the following error:
(base) raphy@pc:~/ai-powered-search/docker$ docker-compose up
ERROR: Version in "./docker-compose.yml" is unsupported. You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g "2.2" or "3.3") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/
(base) raphy@pc:~/ai-powered-search/docker$ sudo docker-compose --version
docker-compose version 1.25.0, build unknown
(base) raphy@pc:~/ai-powered-search/docker$ sudo docker --version
Docker version 20.10.11, build dea9396
Commenting out the version line in docker-compose.yml:
#version: '3.8'
I get this error:
(base) raphy@pc:~/ai-powered-search/docker$ docker-compose up
ERROR: The Compose file './docker-compose.yml' is invalid because:
Unsupported config option for services: 'solr'
Unsupported config option for networks: 'zk-solr'
O.S. : Ubuntu 20.04
How to solve the problem?
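For what it's worth, docker-compose 1.25.0 predates support for the 3.8 Compose file format (that support landed in a later 1.25.x/1.26 release, if I recall correctly), so the options are either upgrading docker-compose (e.g. `pip install -U docker-compose`) or pinning the file to a format the installed binary understands. A hedged, untested sketch of the second option, keeping the `services` key intact (the service bodies are elided):

```yaml
# docker-compose.yml — workaround sketch, untested
version: "3.3"   # a format docker-compose 1.25.0 accepts
services:
  solr:
    # ... existing service definition unchanged ...
```

Simply commenting out `version:` does not work here, because version-less files are parsed as the old v1 format, which is why the `solr`/`zk-solr` keys are then rejected as unsupported.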
Listing 13.16 in ch13/4.semantic-search.ipynb
is 13.14 in the manuscript. We need to adjust the listing numbers.
Cell 4 uses a variable that isn't defined
aggr_signals = aggr_signals[aggr_signals["count"] > 1]
aggr_signals.shape[0]
gives
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[4], line 1
----> 1 aggr_signals = aggr_signals[aggr_signals["count"] > 1]
2 aggr_signals.shape[0]
NameError: name 'aggr_signals' is not defined
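The missing definition presumably lives in an earlier cell that was removed or reordered. A hypothetical reconstruction (column names and sample data invented for illustration) of what aggr_signals needs to look like for Cell 4 to run:

```python
import pandas as pd

# Invented sample signals; the real notebook loads these from the signals dataset.
signals = pd.DataFrame({
    "query": ["ipad", "ipad", "tv"],
    "doc":   ["doc1", "doc1", "doc2"],
})

# Aggregate raw signals into per-(query, doc) counts so that the
# aggr_signals["count"] column used by Cell 4 exists.
aggr_signals = (signals.groupby(["query", "doc"]).size()
                       .reset_index(name="count"))

# Cell 4 then filters out singleton signals:
aggr_signals = aggr_signals[aggr_signals["count"] > 1]
print(aggr_signals.shape[0])  # 1
```

Whatever the real upstream cell does, it has to produce a DataFrame named aggr_signals with a "count" column before Cell 4 executes.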
We need to add a warning that this is normal...
Wiping 'tmdb' collection
[('action', 'CREATE'), ('name', 'tmdb'), ('numShards', 1), ('replicationFactor', 1)]
Creating 'tmdb' collection
Status: Success
Del/Adding LTR QParser for tmdb collection
<Response [400]>
Status: Failure; Response:[ {'responseHeader': {'status': 400, 'QTime': 1}, 'errorMessages': ["error processing commands, errors: [{delete-queryparser=ltr, errorMessages=[NO such queryParser 'ltr' ]}], \n"], 'WARNING': 'This response format is experimental. It is likely to change in the future.', 'error': {'metadata': ['error-class', 'org.apache.solr.api.ApiBag$ExceptionWithErrObject', 'root-error-class', 'org.apache.solr.api.ApiBag$ExceptionWithErrObject'], 'details': [{'delete-queryparser': 'ltr', 'errorMessages': ["NO such queryParser 'ltr' "]}], 'msg': "error processing commands, errors: [{delete-queryparser=ltr, errorMessages=[NO such queryParser 'ltr' ]}], ", 'code': 400}} ]
Status: Success
Adding LTR Doc Transformer for tmdb collection
Status: Failure; Response:[ {'responseHeader': {'status': 400, 'QTime': 1}, 'errorMessages': ["error processing commands, errors: [{delete-transformer=features, errorMessages=[NO such transformer 'features' ]}], \n"], 'WARNING': 'This response format is experimental. It is likely to change in the future.', 'error': {'metadata': ['error-class', 'org.apache.solr.api.ApiBag$ExceptionWithErrObject', 'root-error-class', 'org.apache.solr.api.ApiBag$ExceptionWithErrObject'], 'details': [{'delete-transformer': 'features', 'errorMessages': ["NO such transformer 'features' "]}], 'msg': "error processing commands, errors: [{delete-transformer=features, errorMessages=[NO such transformer 'features' ]}], ", 'code': 400}} ]
If I try to run get_category_and_term_vector_solr_response("kimchi"),
I get:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[11], line 1
----> 1 get_category_and_term_vector_solr_response("kimchi")
Cell In[6], line 16, in get_category_and_term_vector_solr_response(keyword)
1 def get_category_and_term_vector_solr_response(keyword):
2 query = {
3 "params": { "fore": keyword, "back": "*:*", "df": "text_t" },
4 "query": "*:*", "limit": 0,
(...)
13 "type" : "terms", "field" : "doc_type", "limit": 1, "sort": { "r2": "desc" },
14 "facet" : { "r2" : "relatedness($fore,$back)" }}}}}}
---> 16 response = run_search(query)
17 return json.loads(response)
Cell In[8], line 12, in run_search(text)
11 def run_search(text):
---> 12 q = urllib.parse.quote(text)
13 qf, defType = "text_t", "lucene"
15 return requests.get(SOLR_URL + "/reviews/select?q=" + q + "&qf=" + qf + "&defType=" + defType).text
File /opt/conda/lib/python3.10/urllib/parse.py:869, in quote(string, safe, encoding, errors)
867 if errors is not None:
868 raise TypeError("quote() doesn't support 'errors' for bytes")
--> 869 return quote_from_bytes(string, safe)
File /opt/conda/lib/python3.10/urllib/parse.py:894, in quote_from_bytes(bs, safe)
889 """Like quote(), but accepts a bytes object rather than a str, and does
890 not perform string-to-bytes encoding. It always returns an ASCII string.
891 quote_from_bytes(b'abc def\x3f') -> 'abc%20def%3f'
892 """
893 if not isinstance(bs, (bytes, bytearray)):
--> 894 raise TypeError("quote_from_bytes() expected bytes")
895 if not bs:
896 return ''
TypeError: quote_from_bytes() expected bytes
Also, run_search is defined later in the notebook, not before this function.
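The immediate cause is a type mismatch: run_search URL-quotes its argument, so it only accepts a query string, while get_category_and_term_vector_solr_response passes the structured JSON query dict. A minimal reproduction that needs no Solr instance:

```python
import urllib.parse

# A plain query string quotes fine:
assert urllib.parse.quote("kimchi reviews") == "kimchi%20reviews"

# But the JSON query DSL is a dict, which falls through to
# quote_from_bytes() and raises the TypeError from the traceback above:
try:
    urllib.parse.quote({"query": "*:*", "limit": 0})
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```

One plausible fix (hedged, not verified against the notebook) is to POST the dict as a JSON body instead, e.g. requests.post(SOLR_URL + "/reviews/select", json=query), which matches how Solr's JSON Request API expects structured queries, and to move run_search's definition above its first use.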
Right now, the container_name
is hardcoded, so when new containers are built (i.e., the code has changed) they don't replace the previous ones and need to be removed explicitly.
In the next round of refactoring, we may want to get rid of container_name
and just use the service names as host names.