google / ffn

Flood-Filling Networks for instance segmentation in 3d volumes.

License: Apache License 2.0

Python 98.80% Jupyter Notebook 1.20%
connectomics segmentation ffn flood-filling-networks instance-segmentation

ffn's Introduction

Flood-Filling Networks

Flood-Filling Networks (FFNs) are a class of neural networks designed for instance segmentation of complex and large shapes, particularly in volume EM datasets of brain tissue.

For more details, see the related publications:

This is not an official Google product.

Installation

No installation is required. To install the necessary dependencies, run:

  pip install -r requirements.txt

The code has been tested on an Ubuntu 16.04.3 LTS system equipped with a Tesla P100 GPU.

Training

FFNs can be trained with the train.py script, which expects a TFRecord file of coordinates at which to sample data from input volumes.

Preparing the training data

There are two scripts to generate training coordinate files for a labeled dataset stored in HDF5 files: compute_partitions.py and build_coordinates.py.

compute_partitions.py transforms the label volume into an intermediate volume where the value of every voxel A corresponds to the quantized fraction of voxels labeled identically to A within a subvolume of radius lom_radius centered at A. lom_radius should normally be set to (fov_size // 2) + deltas (where fov_size and deltas are FFN model settings). Every such quantized fraction is called a partition. Sample invocation:

  python compute_partitions.py \
    --input_volume third_party/neuroproof_examples/validation_sample/groundtruth.h5:stack \
    --output_volume third_party/neuroproof_examples/validation_sample/af.h5:af \
    --thresholds 0.025,0.05,0.075,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 \
    --lom_radius 24,24,24 \
    --min_size 10000
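
As a rough illustration of the description above (not the code in compute_partitions.py), the following pure-Python sketch derives lom_radius from the fov_size and deltas settings and quantizes a local same-label fraction with the sample thresholds. The helper names lom_radius_for, same_label_fraction, and partition_index are hypothetical. The binning rule (counting the thresholds that the fraction meets or exceeds) is an assumption, and the script's exact quantization may differ.

```python
from bisect import bisect_right

def lom_radius_for(fov_size, deltas):
    """Per-axis (fov_size // 2) + deltas, as recommended in the text."""
    return [f // 2 + d for f, d in zip(fov_size, deltas)]

def same_label_fraction(labels, center, radius):
    """Fraction of voxels in the box around `center` labeled like `center`.

    `labels` is a nested list indexed [z][y][x]; the box is clipped at the
    volume border.
    """
    cz, cy, cx = center
    rz, ry, rx = radius
    target = labels[cz][cy][cx]
    same = total = 0
    for z in range(max(cz - rz, 0), min(cz + rz + 1, len(labels))):
        for y in range(max(cy - ry, 0), min(cy + ry + 1, len(labels[0]))):
            for x in range(max(cx - rx, 0), min(cx + rx + 1, len(labels[0][0]))):
                total += 1
                same += labels[z][y][x] == target
    return same / total

def partition_index(fraction, thresholds):
    """Number of thresholds that `fraction` meets or exceeds (assumed binning)."""
    return bisect_right(thresholds, fraction)

THRESHOLDS = [0.025, 0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Toy 5x5x5 label volume: label 1 where x >= 3, label 0 elsewhere.
toy = [[[1 if x >= 3 else 0 for x in range(5)] for _ in range(5)] for _ in range(5)]

radius = lom_radius_for([33, 33, 33], [8, 8, 8])
fraction = same_label_fraction(toy, (2, 2, 2), (1, 1, 1))
index = partition_index(fraction, THRESHOLDS)
```

With fov_size 33 and deltas 8 on every axis, the radius comes out as 24,24,24, which is the value passed to --lom_radius in the sample invocation.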

build_coordinates.py uses the partition volume from the previous step to produce a TFRecord file of coordinates in which every partition is represented with approximately equal frequency. Sample invocation:

  python build_coordinates.py \
     --partition_volumes validation1:third_party/neuroproof_examples/validation_sample/af.h5:af \
     --coordinate_output third_party/neuroproof_examples/validation_sample/tf_record_file \
     --margin 24,24,24
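
To make "approximately equal frequency" concrete, here is a minimal sketch of partition-balanced sampling. It is not the logic of build_coordinates.py: the function name balanced_sample, the input layout, and the sample-with-replacement strategy are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def balanced_sample(coords_by_partition, per_partition, seed=0):
    """Draw `per_partition` coordinates from every partition, with replacement.

    Rare partitions are oversampled and common ones undersampled, so each
    partition ends up equally represented in the output.
    """
    rng = random.Random(seed)
    sample = []
    for partition, coords in sorted(coords_by_partition.items()):
        for _ in range(per_partition):
            sample.append((partition, rng.choice(coords)))
    rng.shuffle(sample)
    return sample

# Hypothetical input: partition index -> list of (z, y, x) coordinates.
coords_by_partition = {
    0: [(z, 10, 10) for z in range(1000)],  # common partition
    5: [(30, 31, 32), (40, 41, 42)],        # rare partition
}
sample = balanced_sample(coords_by_partition, per_partition=100)
counts = Counter(partition for partition, _ in sample)
```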

Sample data

We provide a sample coordinate file for the FIB-25 validation1 volume included in third_party. Due to its size, that file is hosted in Google Cloud Storage. If you haven't used it before, you will need to install the Google Cloud SDK and set it up with:

  gcloud auth application-default login

You will also need to create a local copy of the labels and image with:

  gsutil rsync -r -x ".*.gz" gs://ffn-flyem-fib25/ third_party/neuroproof_examples

Running training

Once the coordinate files are ready, you can start training the FFN with:

  python train.py \
    --train_coords gs://ffn-flyem-fib25/validation_sample/fib_flyem_validation1_label_lom24_24_24_part14_wbbox_coords-*-of-00025.gz \
    --data_volumes validation1:third_party/neuroproof_examples/validation_sample/grayscale_maps.h5:raw \
    --label_volumes validation1:third_party/neuroproof_examples/validation_sample/groundtruth.h5:stack \
    --model_name convstack_3d.ConvStack3DFFNModel \
    --model_args "{\"depth\": 12, \"fov_size\": [33, 33, 33], \"deltas\": [8, 8, 8]}" \
    --image_mean 128 \
    --image_stddev 33
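
The --model_args value is a JSON object, so the escaped quotes are easy to get wrong by hand. The sketch below, which uses only the Python standard library, builds the string from a dictionary and quotes it for the shell. The variable names are illustrative, and the settings are the ones from the command above.

```python
import json
import shlex

model_args = {"depth": 12, "fov_size": [33, 33, 33], "deltas": [8, 8, 8]}

# Serialize once, then let shlex quote the result for the shell
# instead of escaping every double quote by hand.
model_args_json = json.dumps(model_args)
flag = "--model_args " + shlex.quote(model_args_json)

# The JSON round-trips to the same settings.
assert json.loads(model_args_json) == model_args
print(flag)
```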

Note that both training and inference with the provided model are computationally expensive processes. We recommend a GPU-equipped machine for best results, particularly when using the FFN interactively in a Jupyter notebook. Training the FFN as configured above requires a GPU with 12 GB of RAM. You can reduce the batch size, model depth, fov_size, or number of features in the convolutional layers to reduce the memory usage.

The training script is not configured for multi-GPU or distributed training. For instructions on how to set this up, see the documentation on Distributed TensorFlow.

Inference

We provide two examples of how to run inference with a trained FFN model. For a non-interactive setting, you can use the run_inference.py script:

  python run_inference.py \
    --inference_request="$(cat configs/inference_training_sample2.pbtxt)" \
    --bounding_box 'start { x:0 y:0 z:0 } size { x:250 y:250 z:250 }'

which will segment the training_sample2 volume and save the results in the results/fib25/training2 directory. Two files will be produced: seg-0_0_0.npz and seg-0_0_0.prob. Both are in the npz format and contain a segmentation map and quantized probability maps, respectively. In Python, you can load the segmentation as follows:

  from ffn.inference import storage
  seg, _ = storage.load_segmentation('results/fib25/training2', (0, 0, 0))

We provide sample segmentation results in results/fib25/sample-training2.npz. For the training2 volume, segmentation takes ~7 min with a P100 GPU.

For an interactive setting, check out ffn_inference_colab_demo.ipynb. This Colab notebook shows how to segment a single object with an explicitly defined seed and visualize the results while inference is running.

Both examples are configured to use a 3d convstack FFN model trained on the validation1 volume of the FIB-25 dataset from the FlyEM project at Janelia.

Further information

Please see doc/manual.md.

ffn's People

Contributors

aschampion, cclauss, chinasaur, imxj, keceli, mjanusz, pgunn, sherryxding

ffn's Issues

TypeError: string indices must be integers

The statement

  import ffn
  prices = ffn.get('^IBEX', start='2010-12-30', end='2019-10-29')
  prices

returns the error:

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_2612/2489336543.py in <module>
1 import ffn
2
----> 3 prices = ffn.get('aapl,msft', start='2010-01-01')

~/anaconda3/envs/enri/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
230 if not kwsyntax:
231 args, kw = fix(args, kw, sig)
--> 232 return caller(func, *(extras + args), **kw)
233 fun.__name__ = func.__name__
234 fun.__doc__ = func.__doc__

~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/utils.py in _memoize(func, *args, **kw)
32 return cache[key]
33 else:
---> 34 cache[key] = result = func(*args, **kw)
35 return result
36

~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/data.py in get(tickers, provider, common_dates, forward_fill, clean_tickers, column_names, ticker_field_sep, mrefresh, existing, **kwargs)
74 # call provider - check if supports memoization
75 if hasattr(provider, "mcache"):
---> 76 data[ticker] = provider(ticker=t, field=f, mrefresh=mrefresh, **kwargs)
77 else:
78 data[ticker] = provider(ticker=t, field=f, **kwargs)

~/anaconda3/envs/enri/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
230 if not kwsyntax:
231 args, kw = fix(args, kw, sig)
--> 232 return caller(func, *(extras + args), **kw)
233 fun.__name__ = func.__name__
234 fun.__doc__ = func.__doc__

~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/utils.py in _memoize(func, *args, **kw)
32 return cache[key]
33 else:
---> 34 cache[key] = result = func(*args, **kw)
35 return result
36

~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/data.py in yf(ticker, field, start, end, mrefresh)
138 field = "Adj Close"
139
--> 140 tmp = pdata.get_data_yahoo(ticker, start=start, end=end)
141
142 if tmp is None:

~/anaconda3/envs/enri/lib/python3.9/site-packages/pandas_datareader/data.py in get_data_yahoo(*args, **kwargs)
78
79 def get_data_yahoo(*args, **kwargs):
---> 80 return YahooDailyReader(*args, **kwargs).read()
81
82

~/anaconda3/envs/enri/lib/python3.9/site-packages/pandas_datareader/base.py in read(self)
251 # If a single symbol, (e.g., 'GOOG')
252 if isinstance(self.symbols, (string_types, int)):
--> 253 df = self._read_one_data(self.url, params=self._get_params(self.symbols))
254 # Or multiple symbols, (e.g., ['GOOG', 'AAPL', 'MSFT'])
255 elif isinstance(self.symbols, DataFrame):

~/anaconda3/envs/enri/lib/python3.9/site-packages/pandas_datareader/yahoo/daily.py in _read_one_data(self, url, params)
151 try:
152 j = json.loads(re.search(ptrn, resp.text, re.DOTALL).group(1))
--> 153 data = j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
154 except KeyError:
155 msg = "No data fetched for symbol {} using {}"

TypeError: string indices must be integers

What could be the cause? I would appreciate any help.
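
For context, Python raises this TypeError when a str is indexed with a string key. One plausible reading of the traceback is therefore that some level of the decoded JSON at line 153 of pandas_datareader/yahoo/daily.py was a string rather than a dict. The sketch below reproduces only the exception type; the payloads are invented and the helper name historical_store is hypothetical.

```python
import json

def historical_store(payload_text):
    """Mimic the nested lookup from the traceback on a decoded JSON payload."""
    j = json.loads(payload_text)
    return j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]

# Invented payloads: the first has "stores" as a dict, the second as a string.
ok_payload = json.dumps(
    {"context": {"dispatcher": {"stores": {"HistoricalPriceStore": {"prices": []}}}}}
)
bad_payload = json.dumps({"context": {"dispatcher": {"stores": "opaque-string"}}})

store = historical_store(ok_payload)

try:
    historical_store(bad_payload)
    error = None
except TypeError as exc:
    # Not a KeyError, so an `except KeyError` handler would not catch it.
    error = exc
```

This also explains why the except KeyError clause visible at line 154 of the traceback does not turn the failure into the "No data fetched" message.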

AxisError with numpy 1.18

I got the following error when I tried to run FFN training with numpy 1.18:
numpy.AxisError: axis 4 is out of bounds for array of dimension 4

Looks like it is due to a recent change in numpy.expand_dims. See https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html.

Here is a patch to fix the error:

diff --git a/ffn/training/inputs.py b/ffn/training/inputs.py
index d1b5c31..b9a9c7e 100644
--- a/ffn/training/inputs.py
+++ b/ffn/training/inputs.py
@@ -152,7 +152,7 @@ def load_from_numpylike(coordinates, volume_names, shape, volume_map,
     if data.ndim == 4:
       data = np.rollaxis(data, 0, start=4)
     else:
-      data = np.expand_dims(data, 4)
+      data = np.expand_dims(data, data.ndim)

     # Add flat batch dim and return.
     data = np.expand_dims(data, 0)
--
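
The patched line can be checked in isolation. For a 3-D input the expanded array has four dimensions, so the valid positions for the new axis are 0 to 3. Axis 4 is out of range, which matches the error message, while data.ndim (3 here) appends the axis at the end. A minimal sketch, with add_channel_axis as a hypothetical helper name:

```python
import numpy as np

def add_channel_axis(data):
    """Append a trailing axis of length 1, whatever the input rank is."""
    return np.expand_dims(data, data.ndim)

# A 3-D patch, i.e. the case handled by the `else` branch in the patch.
patch = np.zeros((2, 3, 4), dtype=np.float32)
with_channel = add_channel_axis(patch)

# Add the flat batch dimension, as the following line of the patch does.
batched = np.expand_dims(with_channel, 0)
```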

Realignment and Irregular section substitution

The paper "Automated Reconstruction of a Serial-Section EM Drosophila Brain with Flood-Filling Networks and Local Realignment" mentions a local realignment step and an irregular section substitution step that run before the FFN. Is that code available somewhere? I couldn't find it in this repository.

Undefined names: 'sampling' and '_required'

flake8 testing of https://github.com/google/ffn on Python 3.7.1

$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics

./ffn/inference/resegmentation_analysis.py:256:7: F821 undefined name 'sampling'
      sampling, result.eval.from_a)
      ^
./ffn/inference/resegmentation_analysis.py:259:7: F821 undefined name 'sampling'
      sampling, result.eval.from_b)
      ^
./ffn/utils/bounding_box.py:374:17: F821 undefined name '_required'
          yield _required(self.start_to_box((x, y, z)))
                ^
./ffn/utils/bounding_box.py:395:11: F821 undefined name '_required'
          _required(self.index_to_sub_box(i)) for i in range(i_begin, i_end))
          ^
4     F821 undefined name 'sampling'
4

E901, E999, F821, F822, and F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These five differ from most other flake8 issues, which are merely "style violations": useful for readability, but they do not affect runtime safety.

  • F821: undefined name name
  • F822: undefined name name in __all__
  • F823: local variable name referenced before assignment
  • E901: SyntaxError or IndentationError
  • E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree
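
To illustrate why F821 matters at runtime, the sketch below defines a function that refers to a name that is never defined. The module still imports and the function still runs on the path that avoids the faulty line. The NameError appears only when that line executes, which is why a static check is useful. The function and the name scale are invented for the example.

```python
def report(values):
    """Compiles fine, but `scale` is never defined anywhere (an F821 case)."""
    if not values:
        return 0
    return scale * sum(values)  # noqa: F821 (deliberately undefined name)

# The faulty line is never reached for an empty input.
empty_result = report([])

try:
    report([1, 2, 3])
    error = None
except NameError as exc:
    error = exc
```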

ffn_inference_colab_demo issue

IndexError Traceback (most recent call last)
in <cell line: 6>()
4 else:
5 vis_update = 1
----> 6 canvas.segment_at((125, 125, 125), dynamic_image=inference.DynamicImage(),vis_update_every=vis_update)

1 frames
/content/ffn/ffn/inference/inference.py in update_at(self, pos, start_pos)
411 end = start + self._input_seed_size
412 logit_seed = np.array(
--> 413 self.seed[[slice(s, e) for s, e in zip(start, end)]])
414 init_prediction = np.isnan(logit_seed)
415 logit_seed[init_prediction] = np.float32(self.options.pad_value)

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
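
The failing line indexes an array with a list of slice objects. Recent NumPy releases reject that form with the IndexError quoted above, whereas a tuple of slices selects the intended sub-box. The sketch below shows the tuple form on a toy array. The array contents and sizes are made up, and whether changing inference.py in this way is the right fix for the notebook is an assumption.

```python
import numpy as np

seed = np.arange(4 * 5 * 6, dtype=np.float32).reshape(4, 5, 6)
start = np.array([1, 1, 1])
end = start + np.array([2, 3, 4])

# Build a tuple (not a list) of slices, one per axis.
sel = tuple(slice(s, e) for s, e in zip(start, end))
logit_seed = np.array(seed[sel])
```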

TensorFlow record files are corrupted

I'm trying to train an FFN, and the first 2 steps (partition and build the coordinate file) seem to go fine, but training throws Key Value errors. On further inspection (using TFRecord Viewer), I get this error:
tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 0.

Any help would be super appreciated!

TF version 1.13.2, and here are the exact calls I'm making:

For computing partitions:

  python ../../ffn/compute_partitions.py \
    --input_volume ../training_data_img.h5:label \
    --output_volume training_data2.h5:af \
    --thresholds 0.025,0.05,0.075,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 \
    --lom_radius 3,3,3 \
    --min_size 10000

For building the TFRecord file:

  python ../../ffn/build_coordinates.py \
    --partition_volumes validation:training_data2.h5:af \
    --coordinate_output tf_record_fileyw \
    --margin 3,3,3

and for training:

  python ../../ffn/train.py \
    --train_coords tf_record_fileyw \
    --data_volumes validation1:../training_data_img.h5:image \
    --label_volumes validation1:../training_data_img.h5:label \
    --model_name convstack_3d.ConvStack3DFFNModel \
    --model_args "{\"depth\": 2, \"fov_size\": [2, 2, 2], \"deltas\": [2, 2, 2]}" \
    --image_mean 72 \
    --image_stddev 33

Parameter Optimisation Recommendations

Hi!
I am running your FFN model on some data from the Allen Brain Atlas. The dataset is 100x100x100 pixels and looks like this:

Baseline: [screenshot]

Groundtruth: [screenshot]

See the data here: https://drive.google.com/drive/folders/1TnhPA7sqIJj_KKC1bHS4zOrOdKKHY0vm?usp=sharing

Do you have any recommendations for which parameters in the model I should change in the FFN code to accommodate the differences of my dataset compared to the example EM dataset (such as the smaller size)?

Thank you!

Question about blank areas in inference labels

Hi,

I have been working on FFN training for a couple of weeks. However, the inference result is still not ideal. Large neurons are detected properly, while many tiny neurons are left blank. Is there any method to fill in these small blank areas?

Would appreciate any suggestions and insights.

Multi-GPU utilization

I am looking for a way to correctly use multiple GPUs for training and inference.

I am currently running inference on separate parts of a large volume on multiple GPUs. The labels produced on each GPU are independent of one another, and I am not sure how to combine them into one large segmentation.

I would really appreciate any help or insights.

Training model does not use GPU

Ubuntu 16.04
tensorflow-gpu 1.12
cuda 9.0
cudnn 7.4

TF can only detect CPU:
/job:localhost/replica:0/task:0/device:CPU:0

Training model does not use GPU.

What about consensus and agglomeration steps?

Hello,

Can someone please explain the code corresponding to the consensus and agglomeration steps (as described in the Nature paper) that follow inference?
https://github.com/google/ffn/blob/master/ffn/inference/consensus.py and https://github.com/google/ffn/blob/master/ffn/inference/resegmentation.py seem to contain routines associated with these two steps. However, there are no instructions for using these modules. Has anyone succeeded in running these two steps to produce the so-called ffn-c output?

Thanks,

Issue with build_coordinates.py

Hello, I installed ffn and downloaded the sample data without issue. I am trying to train the sample model and got an error.

I ran this:

  (ffn) user:~/ffn-master$ runffn.sh
  #!/bin/bash
  python compute_partitions.py \
    --input_volume ~/third_party/neuroproof_examples/validation_sample/groundtruth.h5:stack \
    --output_volume ~/third_party/neuroproof_examples/validation_sample/af.h5:af \
    --thresholds 0.025,0.05,0.075,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 \
    --lom_radius 24,24,24 \
    --min_size 10000
  python build_coordinates.py \
    --partition_volumes ~/third_party/neuroproof_examples/validation_sample/af.h5:af \
    --coordinate_output ~/third_party/neuroproof_examples/validation_sample/tf_record_file \
    --margin 24,24,24

compute_partitions.py runs fine, but with build_coordinates I get this:

Traceback (most recent call last):
File "build_coordinates.py", line 110, in <module>
app.run(main)
File "/home/ncmir-lab/.conda/envs/ffn/lib/python3.6/site-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/home/ncmir-lab/.conda/envs/ffn/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "build_coordinates.py", line 62, in main
name, path, dataset = partvol.split(':')
ValueError: not enough values to unpack (expected 3, got 2)
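
The last frame shows that each --partition_volumes entry is split on ':' into exactly three parts (name, path, dataset), as in the README's validation1:...af.h5:af. The entry in the script above has only a path and a dataset. A hypothetical helper, parse_volume_spec, that makes the requirement explicit and reports it in a readable way:

```python
def parse_volume_spec(spec):
    """Split 'name:path:dataset' and fail with a readable message otherwise."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(
            "expected name:path:dataset, got %d field(s) in %r" % (len(parts), spec)
        )
    name, path, dataset = parts
    return name, path, dataset

# Entry in the form used by the README sample invocation.
good = parse_volume_spec(
    "validation1:third_party/neuroproof_examples/validation_sample/af.h5:af"
)

# Entry without the leading volume name, as in the script above.
try:
    parse_volume_spec("third_party/neuroproof_examples/validation_sample/af.h5:af")
    message = None
except ValueError as exc:
    message = str(exc)
```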

Evaluating the segmentation

In the paper "High-Precision Automated Reconstruction of Neurons with Flood-filling Networks", the results are evaluated using edge accuracy and expected run length. The documentation mentions that the segmentation evaluation code is currently not part of the FFN repository. Is that code available somewhere?

syntax error in _required definition

Hi, I am suddenly getting a syntax error when I'm running an FFN inference script with the latest build. I think it originated from this commit: 3608a17

Here is the traceback:

[2019-11-02 16:06:06,460] {docker_operator.py:244} INFO - Traceback (most recent call last):
  File "run_inference.py", line 31, in <module>
    from ffn.inference import inference
  File "/ffn/ffn/inference/inference.py", line 38, in <module>
[2019-11-02 16:06:06,460] {docker_operator.py:244} INFO - from . import align
  File "/ffn/ffn/inference/align.py", line 22, in <module>
    from ..utils import bounding_box
  File "/ffn/ffn/utils/bounding_box.py", line 192
    def _required(bbox: Optional[BoundingBox]) -> BoundingBox:
                      ^
SyntaxError: invalid syntax

build_coordinates generating TFRecords so slowly

Why is 'build_coordinates' generating TFRecords so slowly? It takes 4 hours to create TFRecords for 500x500x200 images and 3 days for 2000x2000x200. Moreover, the performance utilization is very low; the server with an A100 GPU and the personal computer with an A2000 GPU have nearly identical speeds.
