The spark-notebook's discuss from spark-notebook

Simple scala html widgets

ul, li, table, etc (some are already in package.scala)

Open notebooks from any folder

In Ipython, the command ipython notebook can be called from any directory, launching a web service with all the .ipynb in the current directory and subdirs.

It would be useful to mantain the same approach, to store the notebooks within the projects / git repos / etc.

Integrate with bokeh

This works slightly started here.

It needs to be continued here, it's a great move and idea! This would ease the integration of drawing capabilities, at least a first and complete one.

Other ideas will be nvd3, c3.js and so on.

Data Connector/Browser

Widget that would be
File browser,
HDFS browser,
Tachyon browser
...

Specific context for Shell call

Add a new context :sh like :sql, :cp.

We can use scala.sys.process for execute them.

Validate that notebooks.dir conf works

Solve the assembly issue that unfiltered fails to serve some files

When using the assembly build using java -jar some problems happen that wouldn't in sbt.

The symptom is that some files cannot be found in the assembly by unfiltered (or netty?)

Dynamic Form are with default values

Form is loosing provided data. Although it could be great to store this information somewhere (added as Json somewhere in the block on the server side?).

Download notebook as code

Add a menu item to download the code part as a scala file. Maybe consider markdown as comments, :cp and :dp will be more complicated and could be left apart at first

Table preview of SchemaDD / DataFrame

Show the rows in the RDD in a table.

Even cooler would be to have a R's summary functionality on it.

Cursor size glitch

It happens in lines # > 1. The cursor is spanning more than one line in height, this is distracting me so much and is very uncomfortable.

Late/Lazy/Delayed startup of observable.js

This line fails when executed before IPython init → hence it breaks observable mechanism, which is almost everything interesting...

Change default port 9000

Hey guys, I hope this is the right spot to ask questions (and possibly raise an issue).

I would like to run spark-notebook on a server, also used as namenode of an Hadoop Cluster (for a proof-of-concept).
When starting spark-notebook, it produces an error:
"org.jboss.netty.channel.ChannelException: Failed to bind to: /0.0.0.0:9000"
Most likely, it cannot bind to that port, since Hadoop Filesystem is already bind to port 9000.

Is there a way to change the port of spark-notebook?

Best regards,
Benjamin

Dependencies are not updated

Executing "Update classpath and Spark's jars", it seems the dependencies are not updated as documented.

At the end of the execution, the key spark.jars is still empty.

Also, trying to add mllib with the following code doesn't have any effect:

:dp org.apache.spark  % "spark-mllib_2.10" % 1.2.0

Executing spark while "play run" is failing

This is a tricky one (again), because looks like it's only happening when running in Dev mode (like run).

Hence, it's still hard to debug or develop things involving spark execution thus (Spark, SparkInfo, Sql, and c°)... Which is sad.

So the symptom is that there is a mismatch while writing/reading Tasks by the closure serializer (as far as the current investigation discovered). The problem occurs for scala.Option, scala.None$. The serialiazed (Java) is reporting bad serial uid, but the show ids (see below) are for Option and None.type → that's normal. Although, the written bytes should only refer the characters None$ and never Option.

No idea, how this happens now, however, I guess that the Option field in Task is metrics.

[error] o.a.s.e.Executor - Exception in task 1.0 in stage 0.0 (TID 1)
java.io.InvalidClassException: scala.Option; local class incompatible: stream classdesc serialVersionUID = -2062608324514658839, local class serialVersionUID = 5081326844987135632
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617) ~[na:1.7.0_72]
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622) ~[na:1.7.0_72]
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) ~[na:1.7.0_72]
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622) ~[na:1.7.0_72]
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) ~[na:1.7.0_72]
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771) ~[na:1.7.0_72]
[error] o.a.s.s.TaskSetManager - Task 7 in stage 0.0 failed 1 times; aborting job

Deal with Ivy deps (Aether)

Plug some stuffs in Deps.scala

S3 link format

Clicking the S3 link from: https://github.com/andypetrella/spark-notebook/releases

Gives me:

Is that expected, instead of a directory hierarchy?

Forked process won't work using `play run`

This is mainly a play (sbt) problem actually, it seems that it interferes with the classpath construction and loading, hence the classes are not found in the forked process.

Some stuffs have been tried so far, like this, however if now the process can find the classes, it fails weirdly at runtime:
cannot find function f in StringContext.

The thing is that f is actually a macro and should have been injected in the class' bytecode (AFAIK) hence it should be resolved.

A clash in the scala version is one of the potential problem, but not sure.

Run notebook on mesos

"Executor Spark home spark.mesos.executor.home is not set!" even if spark.executor.uri is correctly set

Restore downloads as Snb + Scala

The switch to the new IPython UI broke it

Using up arrow (`↑`) on second line of input moves to previous block

This is very annoying... need to play with → and ← and end to get to the first character of a block...

Failed to initialize compiler: object scala.runtime in compiler mirror not found

I get "Failed to initialize compiler: object scala.runtime in compiler mirror not found" errors.
What I did is just ran

sbt run

And tried simplest example

`Deps.resolveAndAddToJars` should also be able to act like `:cp`

Painful to have to download the deps then still have to create the according :cp block.

So resolve should have a parameter that will update the classpath in one go.

Spark 1.2

It'll need this: https://issues.apache.org/jira/browse/SPARK-4923

But also, akka has been bumped to a new version → switch to Play 2.3 should be okay now (hence some adaptations in the websocket would be nice, like using actors right away).

Shutdown remote processes

when page is closed or reloaded, the remote process should be shut down.

Notebook showing how to use GeoTrellis

that would be a great show up
/cc @lossyrob ^^

Header glitch -- disappearing oO

Sometimes, after runs or things like these (even just at first load), the header is not visible and screws the UI.

It requires a click somewhere around the banner. This is a classical symptom of bad dom/css manip.

Improve doc with new features

docker installation
data table, :sh, hadoop and version builds
so on and so forth

Sanitize the themes

LSS: lots of css imports...

Longer:
Quite some different themes are used, like jquery-ui, bootstrap.
Also libs like bokeh are defining clashing class names (with jquery-ui AFAIK) and are either shipping or relying on certain css libs (jquerui / boostrap?)

Create small website...

Using :cp looses env

Since using :cp will restart the REPL the env (variables mainly created from the history) are lost.

Solution is to ask a new REPL by passing the history of the previous one...

Enhance Spark feedback

The current is very basic and can be found here.

In the near future such thingy should be even easier to have → apache/spark#3009

SparkR integration

Need a new context (:r?)

Need to see how to integrate and use SparkR from scala.

S3 Wagon for Aether

Plug some stuffs in Deps.scala.

Ref(?): http://site.kuali.org/maven/wagons/maven-s3-wagon/1.1.20/

Add twitter stream example

would be so fun, mostly if it has an evolving graph

Create a print css

When printing only a part of the notebook is printed, the rest is actually hidden

Dockerized Container

Would be nice to spin it up in a VM or a container, on kubernetes or Mesos.
I've started working on it, but it's a bit of a hack for the command.
Security, easy url, and sbt are a little painful;
I'l send a Pr when I can.

Enable interpolation in Markup block

It's painful to use <h2></h2> code in code blocks because we need to render a local variable.

The idea here is to make the local variables accessible to the markup renderer.

Register predefined SparkContexts

And update the main page to choose which SparkContext to use when a notebook is opened.

Within a notebook, allow Spark to choose one of them.

no response to statement execution

I am testing using the "Simple spark" example . When executing any command there is no response.

terminal-log

Embedded server listening at
http://0.0.0.0:8899
Press any key to stop.
log4j:ERROR Could not find value for key log4j.appender.console
log4j:ERROR Could not instantiate appender named "console".
log4j:WARN No appenders could be found for logger (notebook.kernel.pfork.BetterFork$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[WARN] [12/20/2014 16:32:12.241] [main] [EventStream(akka://Remote)] [akka.event-handlers] config is deprecated, use [akka.loggers]

Git Clone on Windows machine

error: unable to create file Spark+on+Mesos+using+C*.snb (Invalid argument)

probbaly some irregular char in the file name.

Can I submit a PR ?

spark-notebook / spark-notebook Goto Github PK

spark-notebook's Issues

Recommend Projects

Recommend Topics

Recommend Org