juliaearth / geospatial-data-science-with-julia

Geospatial Data Science with Julia

Home Page: https://juliaearth.github.io/geospatial-data-science-with-julia

TeX 100.00%
book computational data geo geometry geospatial geostatistics julia science statistics

geospatial-data-science-with-julia's People

Contributors

bkamins, danielvandh, dependabot[bot], eliascarv, erickchacon, juliohm, kylebeggs, maxdebayser, ronisbr

geospatial-data-science-with-julia's Issues

Comment regarding GeoInterface

Congratulations on the book, it's looking very good! I'm looking forward to the sections still in progress.

While skimming along, I encountered the following in 3.3:

If we treated this geometry as a generic polygon represented by a vector of vertices in memory, like it is done in GeoInterface.jl for example, we wouldn’t be able to dispatch optimized code that is only valid for a triangle:

I agree with the overall point: generic representations of polygons from GIS are bad for any optimization. But I don't think the example makes sense here, because GeoInterface can actually do what you want: it has a TriangleTrait, and we dispatch on it for things like npoint (https://juliageo.org/GeoInterface.jl/stable/guides/defaults/#Fallbacks). Besides, as an interface, it doesn't decide how something should be represented in memory.

Quick example for a LineString:

julia> using GeoInterface
julia> l = GeoInterface.Wrappers.Line([[1,2], [1,2]])
GeoInterface.Wrappers.Line{false, false, Vector{Vector{Int64}}, Nothing, Nothing}([[1, 2], [1, 2]], nothing, nothing)

julia> @code_llvm GeoInterface.npoint(l)
;  @ /Users/evetion/.julia/packages/GeoInterface/8VGPL/src/interface.jl:146 within `npoint`
define i64 @julia_npoint_566({ {}* }* nocapture noundef nonnull readonly align 8 dereferenceable(8) %0) #0 {
top:
  ret i64 2
}

I would propose removing "like it is done in GeoInterface.jl for example" from the sentence above. A replacement, if needed, could be "like it is done in GIS/Simple Features representations (such as WKT/WKB, etc.)".
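To illustrate the point about trait dispatch, here is a minimal sketch assuming GeoInterface.jl's `geomtrait` function and its `LineTrait` and `AbstractGeometryTrait` types. The `describe` helper is hypothetical and exists only for this example; it is not part of GeoInterface.jl.

```julia
using GeoInterface
const GI = GeoInterface

# Hypothetical helper: forward to a method chosen by the geometry's trait.
describe(geom) = describe(GI.geomtrait(geom), geom)

# Specialized method, selected only for geometries that report LineTrait.
describe(::GI.LineTrait, geom) = "line with exactly two points"

# Generic fallback for any other geometry trait.
describe(::GI.AbstractGeometryTrait, geom) =
    "generic geometry with $(GI.npoint(geom)) points"

l = GI.Wrappers.Line([[1, 2], [3, 4]])
println(describe(l))
```

The same pattern would apply to `TriangleTrait`: the specialized method is chosen by the trait, regardless of how the vertices are stored in memory.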

Add new features

The following features need a mention in the book:

  • tablejoin vs. geojoin
  • InterpolateMissing
  • DropNaN, InterpolateNaN
  • Aggregate, Transfer
  • Downscale, Upscale
  • viz/viz! and cbar

Various minor comments related to the text I noticed

In https://github.com/JuliaEarth/geospatial-data-science-with-julia/blob/main/01-geodata.qmd#L175:

We can check that the representation is a valid representation of a table using the `Tables.istable` function:

is not 100% accurate. Tables.istable can return false and still the passed object can be a valid table. I would just skip it. Tables.istable is meant mostly for package developers that know the internals of Tables.jl well.
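A small sketch of this caveat, based on the case described in the Tables.jl documentation (a generator of named tuples). The variable names are illustrative only.

```julia
using Tables

# A generator of NamedTuples: each element behaves like a row,
# but the element type is not known statically.
rows = ((a = i, b = 2i) for i in 1:3)

# The static check cannot tell that this is a table...
println(Tables.istable(rows))

# ...yet the object can still be consumed through the Tables.jl interface.
cols = Tables.columntable(rows)
println(cols.a)
```

This is why a `false` result from `Tables.istable` does not prove that an object is unusable as a table.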

This is a soft comment


In https://github.com/JuliaEarth/geospatial-data-science-with-julia/blob/main/01-geodata.qmd#L168C2-L168C2 I would explain why you use Symbol here (above you used strings). Also, maybe consider using CategoricalArrays.jl if you feel that GENDER is categorical?
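For reference, a minimal sketch of the three options mentioned above, assuming CategoricalArrays.jl's `categorical` and `levels` functions. The values mirror the GENDER column used elsewhere in this thread; the variable names are illustrative.

```julia
using CategoricalArrays

# Option 1: plain strings
gender_str = ["male", "female", "male", "female", "female"]

# Option 2: symbols (interned, so equal symbols are the same object)
gender_sym = Symbol.(gender_str)

# Option 3: a categorical array, which stores the set of levels explicitly
gender_cat = categorical(gender_str)

println(eltype(gender_sym))
println(levels(gender_cat))
```

Which option suits the book best depends on whether the text wants to discuss categorical variables at that point.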

This is a soft comment


https://github.com/JuliaEarth/geospatial-data-science-with-julia/blob/main/01-geodata.qmd#L199

This row-major representation can be useful to process data that

Is not 100% accurate. The sentence says "This", but the representation you show does not have this property. The point is that row-wise representations in general are useful for larger-than-RAM data, but not the specific one you present. Also, "larger than RAM" data will often return false from Tables.istable, and only at run-time is it checked that the data is indeed tabular.

Intersection graph in chapter 04 does not seem right.

In the section of operations there is an example of the intersection of 2 geometries. Is that graph right?

outer = [(8, 0), (4, 8), (2, 8), (-2, 0), (0, 0), (1, 2), (5, 2), (6, 0)]
inner = [(4, 4), (2, 4), (3, 6)]
poly  = PolyArea([outer, inner])
quad  = Quadrangle((0, 1), (3, 1), (3, 7), (0, 7))

int = poly ∩ quad

viz([poly, quad, boundary(int)],
    color = ["slategray3", "teal", "red"],
    alpha = [1.0, 0.2, 1.0])

[Image: rendered plot of the code above]

I think there is a red segment in the center that should not be there?

PDF rendering ?

OK, I may look like a dinosaur, but sometimes a PDF is useful :-)
Is it possible to render the book as a single PDF ?

Prompt the reader to download the data files

I may have missed it.
However, I got to this point without having downloaded the data files:

https://juliaearth.github.io/geospatial-data-science-with-julia/02-geoviz.html

I notice that they are here:
https://github.com/JuliaEarth/geospatial-data-science-with-julia/tree/main/data

But it would be good to be prompted to download them to a suitable location at the start of the book, and perhaps reminded of this when they are used.

I see that you use GitHub Actions.
Perhaps you could add a step to zip the data directory and copy it to the quarto publish directory before publishing to GitHub pages. Then you could link it as:

https://juliaearth.github.io/geospatial-data-science-with-julia/data.zip

Proposal to improve table rendering

Hi!

For reasons I cannot remember (maybe @bkamins can), the headers in DataFrames output are aligned to the left whereas some numeric columns are aligned to the right. This works pretty well in the terminal and in Jupyter. However, the tables in Quarto are post-processed to fill the entire space, leading to a bad representation as follows:

[Screenshot 2023-11-01 at 11:30:12]

Take a look how the same table is shown in Jupyter:

[Screenshot 2023-11-01 at 13:15:06]

Hence, I propose changing how DataFrames are shown here in the book, as follows:

` ` `{julia}
#| output: false
using DataFrames

df = DataFrame(
  NAME=["John", "Mary", "Paul", "Anne", "Kate"],
  AGE=[34, 12, 23, 39, 28],
  HEIGHT=[1.78, 1.56, 1.70, 1.80, 1.72],
  GENDER=["male", "female", "male", "female", "female"]
)
` ` `

` ` `{julia}
#| echo: false
show(stdout, MIME("text/html"), df; header_alignment = :c, alignment = :c)
` ` `

Leading to the much nicer version:

[Screenshot 2023-11-01 at 13:52:37]

P.S.: I tried to override Base.show in Quarto to reduce the work, but I couldn't.

No clear difference between theoretical and empirical correlogram

In the following paragraph of the introduction of chapter 10:

The sample Pearson correlation coefficient studied as a function of the lag $h$
is known as the **correlogram function**. For example, consider the exponential
correlogram function given by $cor(h) = \exp(-h)$:

The first sentence is defining the empirical correlogram function, but the example is providing a theoretical correlogram function. Maybe add a sentence before the example to present the theoretical correlogram? Or remove the word sample to refer to the theoretical correlogram?
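To make the distinction concrete, here is one possible way to write the two objects side by side. The notation is an assumption for illustration and is not taken from the book: $Z$ is a second-order stationary process with variance $\sigma^2$, $N(h)$ is the set of sample pairs separated by a lag of approximately $h$, $\bar{z}$ is the sample mean, and $s^2$ is the sample variance. The sample formula is one common estimator among several.

```latex
% Theoretical correlogram: a property of the process (the model)
cor(h) = \frac{cov\big(Z(u),\, Z(u+h)\big)}{\sigma^2}

% Sample (empirical) correlogram: an estimate computed from data
\widehat{cor}(h) = \frac{\frac{1}{|N(h)|} \sum_{(i,j) \in N(h)} (z_i - \bar{z})(z_j - \bar{z})}{s^2}
```

Under this reading, $cor(h) = \exp(-h)$ is an instance of the first expression, not the second.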
