seung-lab / igneous
Scalable Neuroglancer compatible Downsampling, Meshing, Skeletonizing, Contrast Normalization, Transfers and more.
License: GNU General Public License v3.0
In CloudVolume there is a compress argument that allows output of brotli compressed files. However, I do not see such an argument in Igneous, and I always get gzip compressed files. Is there a way to get brotli compressed files?
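For what it's worth, the create_downsampling_tasks listing further down this page documents compress as accepting None, 'gzip', and 'br', so a hedged sketch of requesting brotli output might look like this (the path and other arguments are illustrative):
import igneous.task_creation as tc

tasks = tc.create_downsampling_tasks(
    'gs://bucket/dataset/layer',  # illustrative path
    mip=0,
    num_mips=1,
    compress='br',  # brotli-compressed output chunks
)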
Hi, igneous team
I see that Igneous can downsample a dataset:
tasks = create_downsampling_tasks(
layer_path, # e.g. 'gs://bucket/dataset/layer'
mip=0, # Start downsampling from this mip level (writes to next level up)
fill_missing=False, # Ignore missing chunks and fill them with black
axis='z',
num_mips=5, # number of downsamples to produce. Downloaded shape is chunk_size * 2^num_mips
chunk_size=None, # manually set chunk size of next scales, overrides preserve_chunk_size
preserve_chunk_size=True, # use existing chunk size, don't halve to get more downsamples
sparse=False, # for sparse segmentation, allow inflation of pixels against background
bounds=None, # mip 0 bounding box to downsample
encoding=None # e.g. 'raw', 'compressed_segmentation', etc
delete_black_uploads=False, # issue a delete instead of uploading files containing all background
background_color=0, # Designates the background color
compress='gzip', # None, 'gzip', and 'br' (brotli) are options
)
I see it just downsamples the dataset 2x2x1: if the dataset's size is 256x256x256, the result is 128x128x256.
So how do I get a 2x2x2 downsample, so that a 256x256x256 dataset produces a 128x128x128 result?
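A hedged sketch of the likely answer, assuming the factor argument of create_downsampling_tasks (it also appears in scripts later on this page); the path is illustrative:
import igneous.task_creation as tc

tasks = tc.create_downsampling_tasks(
    'gs://bucket/dataset/layer',  # illustrative path
    mip=0,
    num_mips=1,
    factor=(2, 2, 2),  # downsample x, y, and z together: 256x256x256 -> 128x128x128
)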
I just started thinking about the sharded format, and my initial test with downsampling showed that igneous supports reading the sharded format -- yay! The downsampled output is unsharded, which is likely just fine (sharding is most important for mip=0).
Is it technically feasible to automatically determine sharding specs for higher mips and write out sharded format? For example, one could keep the same #minishard/chunk and #minishard/shard.
Now that Igneous has a usable CLI, it would be nice to be able to install it as a PyPI package. This is possible thanks to the fact that we moved all the heavy compiled code into other packages already.
I feel the organization of the Igneous code could use some work. Also, because I waited so long, someone else took the igneous PyPI name...
There is a syntax warning I encountered:
igneous/igneous/chunks.py:35: SyntaxWarning: "is not" with a literal. Did you mean "!="?
if (shape is None or dtype is None) and encoding is not 'npz':
Doesn't break anything at the moment but looks like a simple fix.
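The presumed fix is to compare the string with != instead of identity:
if (shape is None or dtype is None) and encoding != 'npz':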
Hi, I am trying to use Docker, but I get some errors. I am probably not using it correctly; could you give me a hint?
~/workspace/zfish_analysis: docker run -it -v /secrets:/secrets -v /import:/import -e "LEASE_SECONDS=3000" seunglab/igneous:master
Deprecation Warning: /root/.cloudvolume/secrets/google-secret.json is now preferred to /secrets/google-secret.json.
Deprecation Warning: /root/.cloudvolume/secrets/aws-secret.json is now preferred to /secrets/aws-secret.json.
Deprecation Warning: /root/.cloudvolume/secrets/boss-secret.json is now preferred to /secrets/boss-secret.json.
Pulling from pull-queue://pull-queue
raised name 'leaseSecs' is not defined
Traceback (most recent call last):
File "/igneous/igneous/task_execution.py", line 70, in execute
task = tq.lease(tag=tag, seconds=int(LEASE_SECONDS))
File "/usr/local/lib/python3.4/site-packages/taskqueue/taskqueue.py", line 155, in lease
tag=tag,
File "/usr/local/lib/python3.4/site-packages/taskqueue/google_queue_api.py", line 110, in lease
'leaseSecs': leaseSecs,
NameError: name 'leaseSecs' is not defined
undefined task
on host 4d57a4214084
Traceback (most recent call last):
File "/igneous/igneous/task_execution.py", line 89, in <module>
command()
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/igneous/igneous/task_execution.py", line 33, in command
execute(tag, queue, server, qurl)
File "/igneous/igneous/task_execution.py", line 70, in execute
task = tq.lease(tag=tag, seconds=int(LEASE_SECONDS))
File "/usr/local/lib/python3.4/site-packages/taskqueue/taskqueue.py", line 155, in lease
tag=tag,
File "/usr/local/lib/python3.4/site-packages/taskqueue/google_queue_api.py", line 110, in lease
'leaseSecs': leaseSecs,
NameError: name 'leaseSecs' is not defined
In zebrafish, the meshing takes a few days. Sometimes there are tasks left over and I have to rerun the whole thing to finish.
Actually, we only need to mesh the reconstructed cells, i.e. sparse meshing.
It would be nice to have this capability.
I installed the latest version of igneous and tried to use the CLI for a simple job like downsampling. After following the instructions on the README page to create tasks and execute them, the log output stops refreshing at some point, showing something like "INFO FunctionTask 23a62836-1dbc-4350-9872-5dc8f6d06a96 succesfully executed in 18.43 sec.", and never returns to the shell; I can only exit the process with ctrl+c.
But I also tried using the Python interface as a simple script to do the same job, and it finished successfully.
I did my test on "CentOS Stream 8" and installed the package using pip. Could you tell me if this is normal, or what I should do to use the CLI correctly? Thanks a million.
I noticed that if the version of networkx is not specified in requirements.txt, an incompatible version will be installed. This can be fixed by specifying networkx==2.1.
Is there a way to create multi-resolution meshes using igneous? The instructions in README.md seem to generate legacy single-resolution meshes only.
Just noting this issue here.
Exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 339, in check_if_exists
self.satisfied_by = pkg_resources.get_distribution(str(no_marker))
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 476, in get_distribution
dist = get_provider(dist)
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 352, in get_provider
return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 895, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 786, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pip._vendor.pkg_resources.ContextualVersionConflict: (intern 0.9.9 (/src/intern), Requirement.parse('intern>=0.9.10'), {'cloud-volume'})
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 143, in main
status = self.run(options, args)
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/commands/install.py", line 318, in run
resolver.resolve(requirement_set)
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/resolve.py", line 102, in resolve
self._resolve_one(requirement_set, req)
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/resolve.py", line 256, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/resolve.py", line 193, in _get_abstract_dist_for
req, self.require_hashes, self.use_user_site, self.finder,
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/operations/prepare.py", line 329, in prepare_editable_requirement
req.check_if_exists(use_user_site)
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 350, in check_if_exists
self.req.name
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 476, in get_distribution
dist = get_provider(dist)
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 352, in get_provider
return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 895, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/local/lib/python3.5/dist-packages/pip/_vendor/pkg_resources/__init__.py", line 786, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pip._vendor.pkg_resources.ContextualVersionConflict: (intern 0.9.9 (/src/intern), Requirement.parse('intern>=0.9.10'), {'cloud-volume'})
All the cool kids have a CLI for their task system.
Examples:
igneous downsample gs://bucket/test/image --mip 2 --sparse --queue sqs://my-sqs-queue
igneous xfer gs://bucket/test/image s3://bucket/test/image --mip 0 --queue /my/queue/dir
igneous delete gs://bucket/test/image --queue /my/queue/dir
igneous mesh create gs://bucket/test/segmentation --mip 3 --shape 511,511,511 --queue sqs://my-sqs-queue
igneous mesh merge gs://bucket/test/segmentation --magnitude 0 --queue sqs://my-sqs-queue
igneous skeleton create gs://bucket/test/segmentation --mip 3 --shape 511,511,511 --sharded
igneous skeleton merge gs://bucket/test/segmentation --mip 3 --shape 511,511,511
igneous execute sqs://my-sqs-queue --parallel 2
To play nicely with certain batch job systems, task_execution should run for only a limited amount of time. We can always start new instances.
I use Igneous to extract skeletons, but I am confused about what to do when my files are not in Google Cloud but on my local system.
Should I make an info file for each file? And does the function create_skeletonizing_tasks process all the files at the same time? I was confused because my files are on the local system and I did not know the expected file structure. Could you give me some examples of how to use local files?
I think the cloudpath reads all the files from Google Cloud, but how do I read local files? This confuses me the most.
The file structure is like this:
├── seg_results
│ ├── 0
│ │ ├── 10
│ │ ├── 11
│ │ └── 12
│ ├── 1
│ │ ├── 10
│ │ ├── 11
│ │ ├── 12
│ │ └── 8
│ ├── 10
│ │ ├── 10
│ │ ├── 11
│ │ ├── 12
│ │ ├── 13
│ ├── 100
│ │ ├── 10
│ │ ├── 11
│ │ ├── 12
│ │ ├── 13
I use the following code to process the files:
cloudpath = 'file:///mnt/f/seg_results/0/10/'
mip = 0
# First Pass: Generate Skeletons
tasks = tc.create_skeletonizing_tasks(
cloudpath,
mip, # Which resolution to skeletionize at (near isotropic is often good)
shape=Vec(512, 512, 512), # size of individual skeletonizing tasks (not necessary to be chunk aligned)
sharded=False, # Generate (true) concatenated .frag files (False) single skeleton fragments
spatial_index=False, # Generate a spatial index so skeletons can be queried by bounding box
#info=None, # provide a cloudvolume info file if necessary (usually not)
info = CloudVolume.create_new_info(
num_channels = 1,
layer_type = 'segmentation',
data_type = 'uint64', # Channel images might be 'uint8'
encoding = 'raw', # raw, jpeg, compressed_segmentation, fpzip, kempressed
resolution = [4, 4, 40], # Voxel scaling, units are in nanometers
voxel_offset = [0, 0, 0], # x,y,z offset in voxels from the origin
mesh = 'mesh',
# Pick a convenient size for your underlying chunk representation
# Powers of two are recommended, doesn't need to cover image exactly
chunk_size = [ 128, 128, 64 ], # units are voxels
#volume_size = [ 250000, 250000, 25000 ], # e.g. a cubic millimeter dataset
volume_size = [125, 1250, 1250], # e.g. a cubic millimeter dataset
),
fill_missing=False, # Use zeros if part of the image is missing instead of raising an error
# see Kimimaro's documentation for the below parameters
teasar_params={'scale':10, 'const': 10},
object_ids=None, # Only skeletonize these ids
mask_ids=None, # Mask out these ids
fix_branching=True, # (True) higher quality branches at speed cost
fix_borders=True, # (True) Enable easy stitching of 1 voxel overlapping tasks
dust_threshold=1000, # Don't skeletonize below this physical distance
progress=False, # Show a progress bar
parallel=1, # Number of parallel processes to use (more useful locally)
)
But from this cloudpath I can only read a portion of the files. Should I change the cloudpath or the info to access all of the files?
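A hedged guess at a fix: Igneous reads local data through CloudVolume's file:// protocol, and the cloudpath should point at the root of a single precomputed layer (the directory containing the info file), which then covers the whole volume rather than one subfolder. The path below is purely illustrative:
cloudpath = 'file:///mnt/f/seg_results/'  # illustrative: directory that holds the "info" file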
Hi,
I'm trying to set up igneous on macOS Catalina and having issues with both the pre-built Docker image and manual installation. Do you suggest any workaround for setting up igneous on Catalina?
$ docker run seunglab/igneous
/usr/local/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)
Traceback (most recent call last):
File "/igneous/igneous/task_execution.py", line 12, in <module>
from igneous import logger
File "/igneous/igneous/logger.py", line 13, in <module>
google_credentials_path, project=PROJECT_NAME)
File "/usr/local/lib/python3.7/site-packages/google/cloud/client.py", line 74, in from_service_account_json
with io.open(json_credentials_path, "r", encoding="utf-8") as json_fi:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cloudvolume/secrets/google-secret.json'
Also, kimimaro seems to fail during build when running pip install -r requirements.txt.
Thank you,
manoaman
I am running our consensus building routine.
This is the command I am using:
sudo docker run -v /secrets:/secrets seunglab/
igneous:master bash -c ' export PIPELINE_USER_QUEUE=zfish; export QUEUE_TYPE=sqs; export SQS_UR
L=https://sqs.us-east-1.amazonaws.com/098703261575/zfish; export LEASE_SECONDS=600; alias pytho
n=python3; export LC_ALL=C.UTF-8; export LANG=C.UTF-8; python /igneous/igneous/task_execution.py
'
It used to work well, but I am getting an error now.
Pulling from sqs://https://sqs.us-east-1.amazonaws.com/098703261575/zfish
HyperSquareConsensusTask(src_path='gs://neuroglancer/zfish_v1/segmentation2',dest_path='gs://neuroglancer/zfish_v1/consensus-20181125',ew_volume_id=28652,consensus_map_path='gs://neuroglancer/zfish_v1/consensus-20181125/zfish_consensus.all.json',shape=[896, 896, 112],offset=[72960, 29952, 16848])
Deprecation Warning: /root/.cloudvolume/secrets/google-secret.json is now preferred to /secrets/google-secret.json.
raised a bytes-like object is required, not 'NoneType'
Traceback (most recent call last):
File "/igneous/igneous/task_execution.py", line 74, in execute
task.execute()
File "/igneous/igneous/tasks/tasks.py", line 646, in execute
consensus = cache(self, self.consensus_map_path).decode('utf8')
File "/igneous/igneous/tasks/tasks.py", line 92, in cache
f.write(filestr)
TypeError: a bytes-like object is required, not 'NoneType'
HyperSquareConsensusTask(src_path='gs://neuroglancer/zfish_v1/segmentation2',dest_path='gs://neuroglancer/zfish_v1/consensus-20181125',ew_volume_id=28652,consensus_map_path='gs://neuroglancer/zfish_v1/consensus-20181125/zfish_consensus.all.json',shape=[896, 896, 112],offset=[72960, 29952, 16848])
on host 98906f76805c
Traceback (most recent call last):
File "/igneous/igneous/task_execution.py", line 94, in <module>
command()
File "/usr/local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/igneous/igneous/task_execution.py", line 34, in command
execute(tag, queue, server, qurl, loop)
File "/igneous/igneous/task_execution.py", line 74, in execute
task.execute()
File "/igneous/igneous/tasks/tasks.py", line 646, in execute
consensus = cache(self, self.consensus_map_path).decode('utf8')
File "/igneous/igneous/tasks/tasks.py", line 92, in cache
f.write(filestr)
TypeError: a bytes-like object is required, not 'NoneType'
Hi all, nice library!
Recently I have been using your library to extract neuron skeletons, but I am curious why the skeleton I extracted is broken and discontinuous, as shown below.
I converted the skeleton to an swc file and displayed it using Vaa3D.
In the red box the skeleton is broken, but the actual neurons are intact. Why does this happen? How do I solve it?
create_mesh_tasks now supports specifying a mesh_dir to override the setting in the info file (#27). But create_mesh_manifest_tasks still only uses the info file, which makes this less useful for non-ChunkedGraph datasets.
On Ran's agglomeration test:
raised value too large to convert to unsigned int
Traceback (most recent call last):
File "/igneous/igneous/task_execution.py", line 60, in execute
task.execute()
File "/igneous/igneous/tasks.py", line 219, in execute
self._compute_meshes()
File "/igneous/igneous/tasks.py", line 224, in _compute_meshes
self._mesher.mesh(data.flatten(), *data.shape[:3])
File "_mesher.pyx", line 34, in _mesher.Mesher.mesh
File "stringsource", line 48, in vector.from_py.__pyx_convert_vector_from_py_unsigned_int
OverflowError: value too large to convert to unsigned int
MeshTask(shape=[512, 512, 512],offset=[2635, 2000, 522],layer_path='gs://neuroglancer/ranl/flyem_agglomeration_test',mip=3,simplification_factor=100,max_simplification_error=40)
on host igneous-1996647607-q94q2
Necessary for Phase II.
You asked, you demanded, you got it.
For some use cases with data that are mostly black, it makes sense to pick non-zero values over zeroed values when downsampling. For data that consist of single points, this can be important. The existing downsampling code treats zero like any other value.
However, in my underground alchemic algorithms laboratory, I developed a new variant that can handle this case. We can integrate it as an option into igneous.
https://github.com/william-silversmith/countless/blob/master/python/countless2d.py#L78
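Possibly the integration point is the sparse flag already documented in the create_downsampling_tasks listing near the top of this page; a hedged sketch of invoking it, assuming it maps onto this prefer-non-zero behavior (path illustrative):
import igneous.task_creation as tc

tasks = tc.create_downsampling_tasks(
    'gs://bucket/dataset/layer',  # illustrative path
    mip=0,
    num_mips=3,
    sparse=True,  # prefer non-background values when downsampling sparse labels
)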
version.py requires the environment variable TRAVIS_BRANCH. However, in many cases this variable is not set and users need to define it manually (and fill it with nonsense) to make igneous work.
https://github.com/seung-lab/igneous/blob/master/version.py#L11
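A hypothetical workaround until version.py no longer requires the variable, assuming it only needs to exist with any placeholder value before igneous is imported (setting it in the shell before installation works the same way):
import os
os.environ.setdefault('TRAVIS_BRANCH', 'master')  # placeholder; the value does not matter
import igneous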
This way we don't have to download everything to check what labels exist.
It would be good to add this as a CLI option too. We should automatically do this for downsampled images or transfers, but older datasets may need upgrades so it should be a separate task too.
Hi all,
thanks for this cool library.
I am generating skeletons from a local precomputed segmentation layer. It worked after changing the function name to create_skeletonizing_tasks(). (The docs say create_skeletonization_tasks().)
However, in Neuroglancer, the skeletons are not recognized / displayed. The info file points to the correct skeletons folder.
Any pointers on what I am doing wrong?
Cheers,
Chris
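For reference, a hedged sketch of the two pieces of metadata Neuroglancer looks for, based on the info files quoted elsewhere on this page; the directory name 'skeletons' and the path are illustrative:
from cloudvolume import CloudVolume

vol = CloudVolume('file:///path/to/segmentation')  # illustrative local layer path
vol.info['skeletons'] = 'skeletons'                # segmentation info names the skeleton subdirectory
vol.commit_info()

# The skeleton subdirectory needs its own info, roughly:
# { "@type": "neuroglancer_skeletons",
#   "transform": [1,0,0,0, 0,1,0,0, 0,0,1,0],
#   "vertex_attributes": [] }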
I have a volume representing a stack of images read out from a microscope. There are 687 2D images, each of which has dimension 2160 x 2560, so the resulting volume has dimension [2160,2560,687] in x,y,z. At the full resolution this volume loads too slowly in Neuroglancer to be usable, so I have turned to downsampling. A successful downsampling scheme I have used is the following:
mip = 0, factor = [2,2,1], resulting dimension = [1080,1280,687]
mip = 1, factor = [2,2,1], resulting dimension = [540,640,687]
mip = 2, factor = [2,2,1], resulting dimension = [270,320,687]
mip = 3, factor = [2,2,1], resulting dimension = [135,160,687]
I am using chunk_size=[128,128,64] for all mip levels.
The code structure I am using to do the downsampling is:
mips = [0,1,2,3]
for mip in mips:
cv = CloudVolume(rechunked_cloudpath, mip)
chunks = calculate_chunks(downsample, mip) # uses the scheme mentioned above
factors = calculate_factors(downsample, mip) # uses the scheme mentioned above
tasks = tc.create_downsampling_tasks(cv.layer_cloudpath,
mip=mip,
num_mips=1,
factor=factors,
preserve_chunk_size=False,
compress=True,
chunk_size=chunks)
tq.insert(tasks)
tq.execute()
This works fine but I would like to downsample in z as well. Let's say I adopt the following downsampling scheme instead:
mip = 0, factor = [2,2,1], resulting dimension = [1080,1280,687]
mip = 1, factor = [2,2,1], resulting dimension = [540,640,687]
mip = 2, factor = [2,2,3], resulting dimension = [270,320,229]
mip = 3, factor = [2,2,1], resulting dimension = [135,160,229]
The only difference is the factor=[2,2,3] in the mip=2 downsample, where I chose 3 because 687 is divisible by 3. The Python code fails on this downsample level and gives me the following error:
Alignment Check:
Mip: 3
Chunk Size: [128 128 64]
Volume Offset: [0 0 0]
Received: Bbox([0, 0, 192],[128, 128, 229], dtype=int32)
Nearest Aligned: Bbox([0, 0, 192],[128, 128, 256], dtype=int32)
I don't understand what the problem is. It seems like there are some left over pixels in the z dimension so there is an incomplete chunk. But shouldn't this happen always unless your image size happens to be divisible by the chunk size? That's almost never the case. I didn't get this error when the z dimension was 687 for example. What is meant by "Aligned" in the error message?
I can strip off a z plane at the end to make the z dimension an even number but I'd ideally like to understand what is causing this issue.
Thanks,
Austin
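As a rough illustration of what the Alignment Check above is comparing (not a fix), assuming cloud-volume's Bbox helper: a write bounding box is "aligned" when its corners land on chunk boundaries relative to the volume offset, so the received box is expanded to the nearest chunk-aligned box:
from cloudvolume import Bbox
from cloudvolume.lib import Vec

chunk_size = Vec(128, 128, 64)
received = Bbox((0, 0, 192), (128, 128, 229))
aligned = received.expand_to_chunk_size(chunk_size, offset=Vec(0, 0, 0))
# aligned == Bbox([0, 0, 192], [128, 128, 256]), the "Nearest Aligned" box in the error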
Hi,
I've been using cloud-volume and igneous to display segmentation and meshes, served from a school server (e.g. link). My problem is that when I have millions of objects, the *.gz files generated by igneous in a single folder become really slow to display. E.g., each single mesh description file takes a long time to access (link).
Wonder if there is any way to solve this issue?
Thanks again for the amazing tool and support!
Donglai
I need to use some functions from igneous; it would be easier to set up Travis tests if igneous were a registered package.
The QUEUE_NAME is not correct when we use SQS? I always get taskQueueName: pull_queue even though I am using SQS. This is not important, but it would be nice to fix it.
Hi,
We are running this pipeline on a 20.04 Ubuntu host, and when it runs it begins to consume RAM and continues until all available system memory has been used up, at which point the process terminates. We have tried adding more RAM and CPU to the machine, but the outcome is the same and the RAM eventually gets used up by Python processes that are each using multiple GB. Any idea why this might be happening? Appreciate the help.
Are the recent GitHub code and Docker Hub versions compatible? I was browsing the commits of the last few months and don't notice anything that looks like a breaking change.
I am updating dependencies in my project, and ran into a conflict with oauth2client being pinned in the older version of igneous I was using. My client code isn't fancy -- it's submitting transfer tasks with downsampling to an AWS queue. My igneous cluster deploys an older igneous docker image, which reads from the queue.
Previously I had pinned:
github commit: 2e1db31f60331420f72c958cedb7932a84fe6ef (2020/09/02)
dockerhub sha256: b359fce8e5b3e5061d6b4800fd61cd2b9b9c8c10e2b5f87f11984d1a3ee7cdfb (cannot locate)
Currently, the most recent versions are:
github commit: 823c9b1 (2021/05/15)
dockerhub sha256: 896dd8db6d7d3bf53bbe12623653eb6d4eb485fcb632ef5c426caa65b25d3ad3 (4 months ago)
I don't know how to tell at which commit the Docker Hub image was built. Are there any incompatibilities to be aware of if I jump to the newest versions?
Thanks in advance!
The travis config seems misconfigured and causes the most recent PR to be pushed to Docker's seunglab/igneous:master 😱
Hi, nice library. Recently I used this library to extract skeletons.
I downloaded the Google segmentation for the FAFB dataset to my local computer,
but when I use the following code, something goes wrong:
cloudpath1 = 'file:///mnt/d/braindata/google_segmentation/google_256.0x256.0x320.0/'
mip = 0
# First Pass: Generate Skeletons
tasks1 = tc.create_skeletonizing_tasks(
cloudpath1,
mip, # Which resolution to skeletionize at (near isotropic is often good)
shape=Vec(64, 64, 32), # size of individual skeletonizing tasks (not necessary to be chunk aligned)
sharded=False, # Generate (true) concatenated .frag files (False) single skeleton fragments
spatial_index=False, # Generate a spatial index so skeletons can be queried by bounding box
info=None, # provide a cloudvolume info file if necessary (usually not)
fill_missing=False, # Use zeros if part of the image is missing instead of raising an error
# see Kimimaro's documentation for the below parameters
teasar_params={'scale':10, 'const': 50},
object_ids=None, # Only skeletonize these ids
mask_ids=None, # Mask out these ids
fix_branching=True, # (True) higher quality branches at speed cost
fix_borders=True, # (True) Enable easy stitching of 1 voxel overlapping tasks
dust_threshold=1000, # Don't skeletonize below this physical distance
progress=False, # Show a progress bar
parallel=1, # Number of parallel processes to use (more useful locally)
)
tq = MockTaskQueue()
tq.insert_all(tasks1)
The output is as follows:
Connected Components Error: Label 34856 cannot be mapped to union-find array of length 34856.
I saw that the Google segmentation results are encoded in hex; should I change them to decimal? I also changed the info file to the following:
{
"@type": "neuroglancer_multiscale_volume",
"data_type": "uint64",
"mesh": "mesh",
"num_channels": 1,
"scales": [
{
"chunk_sizes": [
[
64,
64,
64
]
],
"compressed_segmentation_block_size": [
8,
8,
8
],
"encoding": "compressed_segmentation",
"key": "512.0x512.0x640.0",
"resolution": [
512,
512,
640
],
"sharding": {
"@type": "neuroglancer_uint64_sharded_v1",
"data_encoding": "gzip",
"hash": "identity",
"minishard_bits": 4,
"minishard_index_encoding": "gzip",
"preshift_bits": 9,
"shard_bits": 0
},
"size": [
1944,
1048,
442
],
"voxel_offset": [
0,
0,
0
]
}
],
"skeletons": "skeletons_mip_0",
"type": "segmentation"
}
Hi,
I am trying to serve precomputed skeleton data in a Neuroglancer viewer instance. I have followed the steps mentioned in the precomputed skeleton guidelines and taken inspiration from Igneous' SkeletonTask.
Here is what the info file looks like:
{
"@type": "neuroglancer_skeletons",
"transform": [
1,
0,
0,
0,
0,
1,
0,
0,
0,
0,
1,
0
],
"vertex_attributes": [],
"sharding": None,
"spatial_index": None
}
In my case, there are no vertex attributes that have to be specified. The encoded binary data is served through a <segment-id> endpoint like this:
data = [
np.uint32(num_vertices), # N
np.uint32(num_edges), # M
vertex_positions, # (N,3) float32 array
edges # (M,2) uint32 array
]
encoded_skeleton = b''.join([array.tobytes('C') for array in data])
Both the info and binary data are served through the endpoint "[HOST]:[PORT]/skeletons/". The info file is fetched alright, but the encoded skeleton endpoint is never hit. I have followed the same approach for the meshes and they are rendered fine! Just to be sure that the skeletons are correct, neuroglancer.skeleton.SkeletonSource works well with my skeletons.
Thanks,
Hashir
This is likely something that can be mentioned in the documentation somewhere, and I'm happy to do it. I'm thinking ahead to a situation with a k8s pod auto-scaling on top of an auto-scaling node pool with pre-emptible instances.
If I use pre-emptible nodes, which are much cheaper, will I run into any issues dropping messages as nodes come and go?
Would it work to put a horizontal pod scaler, like the following, in the deployment yaml? If so, what is a decent target CPU, based on prior experience?
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: igneous
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: igneous
  minReplicas: 1
  maxReplicas: 320
  targetCPUUtilizationPercentage: 80
The prefix strategy may not be necessary when there are few enough meshes. A simple directory scan would be enough. Maybe this could be an option or another task creation function?
I tried to downsample the volume 2x2x2 with a very simple Python script:
from taskqueue import LocalTaskQueue
import igneous.task_creation as tc

src_layer_path = 'file://output'
dest_layer_path = 'file://output2'

with LocalTaskQueue(parallel=8) as tq:
    tasks = tc.create_transfer_tasks(
        src_layer_path, dest_layer_path,
        chunk_size=(64,64,16), skip_downsamples=True, compress='gzip'
    )
    tq.insert_all(tasks)

    tasks = tc.create_downsampling_tasks(
        dest_layer_path, factor=(2,2,2), compress='gzip', num_mips=1
    )
    tq.insert_all(tasks)

print("Done!")
However, after the tasks finished I have a 4_4_4 folder (mip 0) and an 8_8_4 folder (mip 1), although the contents of the 8_8_4 folder are indeed 2x2x2 downsampled, so just the folder name is wrong. In addition, the generated info file is also incorrect:
{
"data_type": "uint8",
"num_channels": 1,
"scales": [
{
"chunk_sizes": [
[
64,
64,
16
]
],
"encoding": "raw",
"key": "4_4_4",
"resolution": [
4,
4,
4
],
"size": [
412,
914,
800
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
64,
64,
16
]
],
"encoding": "raw",
"key": "8_8_4",
"resolution": [
8,
8,
4
],
"size": [
206,
457,
800
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
64,
64,
16
]
],
"encoding": "raw",
"key": "16_16_4",
"resolution": [
16,
16,
4
],
"size": [
103,
229,
800
],
"voxel_offset": [
0,
0,
0
]
},
{
"chunk_sizes": [
[
64,
64,
16
]
],
"encoding": "raw",
"key": "8_8_8",
"resolution": [
8,
8,
8
],
"size": [
206,
457,
400
],
"voxel_offset": [
0,
0,
0
]
}
],
"type": "image"
}
It contains several scales that do not exist. After manually correcting the folder name to 8_8_8 and fixing the info file, the volume appears to be correct and can be visualized with Neuroglancer. All of my packages (Igneous, CloudVolume, etc.) are up-to-date as of right now.
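For what it's worth, a hedged sketch of patching the metadata in place rather than hand-editing the file, assuming only the downsampled scale's key/resolution need fixing and the extra scales should be dropped; commit_info() rewrites the info file:
from cloudvolume import CloudVolume

vol = CloudVolume('file://output2')
vol.info['scales'] = [s for s in vol.info['scales'] if s['key'] in ('4_4_4', '8_8_8')]
for s in vol.info['scales']:
    if s['key'] == '8_8_8':
        s['resolution'] = [8, 8, 8]  # match the renamed 8_8_8 directory
vol.commit_info()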
Also contrast normalization tasks.
We could conceivably add a third pass to the skeleton construction by using much higher mip levels to find the somata with a large enough context that it's nbd. We would then find a way to merge the resulting good skeletons with the mess that results from many smaller fields of view. @jabae had some thoughts about this.
Fixing the pinky100 skeletonization would be a great target.
\"/app/src/igneous/igneous/tasks/skeletonization.py\", line 18, in <module>"}
{"source":"unknown","time":"--/app/T::uwsgi.ini.000Z","severity":"error","message":" import kimimaro"}
{"source":"unknown","time":"--/app/T::uwsgi.ini.000Z","severity":"error","message":" File \"/usr/local/lib/python3.6/site-packages/kimimaro/__init__.py\", line 19, in <module>"}
{"source":"unknown","time":"--/app/T::uwsgi.ini.000Z","severity":"error","message":" from .postprocess import postprocess, join_close_components"}
{"source":"unknown","time":"--/app/T::uwsgi.ini.000Z","severity":"error","message":" File \"/usr/local/lib/python3.6/site-packages/kimimaro/postprocess.py\", line 36, in <module>"}
{"source":"unknown","time":"--/app/T::uwsgi.ini.000Z","severity":"error","message":" from cloudvolume import Skeleton, Bbox"}
{"source":"unknown","time":"--/app/T::uwsgi.ini.000Z","severity":"error","message":"ImportError: cannot import name 'Skeleton'"}
{"source":"unknown","time":"-```
Sometimes I would like to change parameters in the middle of a run. For example, I would like to switch from 2-core machines to 4-core machines and need to change the network file parameter, but the parameter was hard-coded in the tasks.
It would be nice to have this capability. It is not urgent though.
I have one application that installs/imports igneous just to get at the igneous.task_creation.create_transfer_tasks method. A separate cluster, based on the igneous Docker image, is responsible for handling tasks from several applications and performing transfer/downsampling for viewers.
In your README.md, you mention being amenable to breaking out pieces of code as separate libraries. Would the task handling code be a candidate for being a separate library?
I'd like to shed the dependencies on libraries like kimimaro/tinybrain/zmesh.
e.g. [4,4,0.4] breaks downsampling
We are planning to manually segment some data, and to reduce the amount of time for reconstruction we are going to segment a downsampled volume. The workflow I'm currently thinking is:
Raw EM data (4x4x4 nm) -> downsample to 16x16x16 nm -> segment at 16x16x16 nm -> upscale segmentation to 4x4x4 nm w/ nearest neighbor -> generate meshes at 4x4x4 nm -> combine upscaled segmentation and meshes with raw EM data and visualize in Neuroglancer
I'm hoping to use Igneous for many of these steps, and I have a few questions:
I saw that a new option, factor, is being added to the downsampling function. Does that mean I can directly downsample from 4x4x4 to 16x16x16 for the raw EM data?
I don't think I saw an upscaling function in Igneous, could you confirm that there isn't such a function?
Is it possible to directly generate meshes with 16x16x16 segmentation data and display them properly in the 4x4x4 dataset? (instead of upscaling and then generating meshes at 4x4x4)
Thank you very much for your help!
Hi Will, I've had trouble installing igneous's prerequisites – I'm sorry to bother you with something that has its roots in non-Seung-lab code, but Tommy recommended I post an issue here and thought you might be interested in it and able to help. Specifically, pip install deflate fails on Ubuntu 16.04.4 and CentOS Linux release 7.7.1908 (but succeeds on macOS 10.13.6 High Sierra). The full output is below. (Here I show it as resulting from pip install deflate, but it's the same thing when trying to pip install -r requirements.txt from a cloned igneous folder.)
Collecting deflate
Downloading deflate-0.1.0.tar.gz (140 kB)
|████████████████████████████████| 140 kB 6.2 MB/s
Building wheels for collected packages: deflate
Building wheel for deflate (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/jtm23/.virtualenvs/igneous/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-29a0i252/deflate/setup.py'"'"'; __file__='"'"'/tmp/pip-install-29a0i252/deflate/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-tzd1qja3
cwd: /tmp/pip-install-29a0i252/deflate/
Complete output (46 lines):
running bdist_wheel
running build
running build_ext
CC lib/deflate_decompress.o
CC lib/utils.o
CC lib/arm/cpu_features.o
CC lib/x86/cpu_features.o
CC lib/deflate_compress.o
CC lib/adler32.o
CC lib/zlib_decompress.o
CC lib/zlib_compress.o
CC lib/crc32.o
CC lib/gzip_decompress.o
CC lib/gzip_compress.o
AR libdeflate.a
CC lib/deflate_decompress.shlib.o
CC lib/utils.shlib.o
CC lib/arm/cpu_features.shlib.o
CC lib/x86/cpu_features.shlib.o
CC lib/deflate_compress.shlib.o
CC lib/adler32.shlib.o
CC lib/zlib_decompress.shlib.o
CC lib/zlib_compress.shlib.o
CC lib/crc32.shlib.o
CC lib/gzip_decompress.shlib.o
CC lib/gzip_compress.shlib.o
CCLD libdeflate.so.0
LN libdeflate.so
GEN programs/config.h
CC programs/gzip.o
CC programs/prog_util.o
CC programs/tgetopt.o
CCLD gzip
LN gunzip
building 'deflate' extension
creating build
creating build/temp.linux-x86_64-3.7
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -I/n/app/libffi/3.2.1/lib/libffi-3.2.1/include -I/n/app/libffi/3.2.1/lib/libffi-3.2.1/include -fPIC -I/n/app/python/3.7.4/include/python3.7m -c deflate.c -o build/temp.linux-x86_64-3.7/deflate.o
creating build/lib.linux-x86_64-3.7
gcc -pthread -shared -L/n/app/libffi/3.2.1/lib64 -L/n/app/libffi/3.2.1/lib64 build/temp.linux-x86_64-3.7/deflate.o libdeflate/libdeflate.a -L/n/app/python/3.7.4/lib -lpython3.7m -o build/lib.linux-x86_64-3.7/deflate.cpython-37m-x86_64-linux-gnu.so
/usr/bin/ld: libdeflate/libdeflate.a(deflate_decompress.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: libdeflate/libdeflate.a(deflate_compress.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: libdeflate/libdeflate.a(crc32.o): relocation R_X86_64_32S against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
error: command 'gcc' failed with exit status 1
----------------------------------------
ERROR: Failed building wheel for deflate
Running setup.py clean for deflate
Failed to build deflate
Installing collected packages: deflate
Running setup.py install for deflate ... error
ERROR: Command errored out with exit status 1:
command: /home/jtm23/.virtualenvs/igneous/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-29a0i252/deflate/setup.py'"'"'; __file__='"'"'/tmp/pip-install-29a0i252/deflate/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-gyn3qb3j/install-record.txt --single-version-externally-managed --compile --install-headers /home/jtm23/.virtualenvs/igneous/include/site/python3.7/deflate
cwd: /tmp/pip-install-29a0i252/deflate/
Complete output (15 lines):
running install
running build
running build_ext
building 'deflate' extension
creating build
creating build/temp.linux-x86_64-3.7
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -I/n/app/libffi/3.2.1/lib/libffi-3.2.1/include -I/n/app/libffi/3.2.1/lib/libffi-3.2.1/include -fPIC -I/n/app/python/3.7.4/include/python3.7m -c deflate.c -o build/temp.linux-x86_64-3.7/deflate.o
creating build/lib.linux-x86_64-3.7
gcc -pthread -shared -L/n/app/libffi/3.2.1/lib64 -L/n/app/libffi/3.2.1/lib64 build/temp.linux-x86_64-3.7/deflate.o libdeflate/libdeflate.a -L/n/app/python/3.7.4/lib -lpython3.7m -o build/lib.linux-x86_64-3.7/deflate.cpython-37m-x86_64-linux-gnu.so
/usr/bin/ld: libdeflate/libdeflate.a(deflate_decompress.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: libdeflate/libdeflate.a(deflate_compress.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: libdeflate/libdeflate.a(crc32.o): relocation R_X86_64_32S against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
error: command 'gcc' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: /home/jtm23/.virtualenvs/igneous/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-29a0i252/deflate/setup.py'"'"'; __file__='"'"'/tmp/pip-install-29a0i252/deflate/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-gyn3qb3j/install-record.txt --single-version-externally-managed --compile --install-headers /home/jtm23/.virtualenvs/igneous/include/site/python3.7/deflate Check the logs for full command output.
Looks like deflate tries to build libdeflate during installation. I tried cloning libdeflate and make-ing it, and that succeeded just fine. So it seems like an issue with how the deflate package tries to link the libdeflate C libraries. I don't know much about C/C++ or compilers so I'm a bit useless here, but I tried adding both extra_compile_args=["-fPIC"] and extra_link_args=["-fPIC"] to deflate/setup.py and then running python setup.py install on that updated script. That succeeded in inserting an -fPIC argument into the gcc command that gets run during install, but it didn't solve the problem and the same error message was output. That's as far as I got. Do you have any idea how to get past this install issue? I would love to start running igneous tasks but this is holding me up.
In playing around with the sharded format, I had a unit test fail when igneous downsampled a certain volume shape. I tracked it down: CloudVolume fails reading a tensorstore-generated volume.
The following demonstrates the issue on a scaled down volume.
import numpy as np
import cloudvolume
import tensorstore as ts
# Base array with ratios 4:4:1.
arr = np.arange(128, dtype=np.uint16).reshape((8, 8, 2, 1))
chunk_size = (2,2,2)
# Single shard with single minishard
sharding = {
"@type": "neuroglancer_uint64_sharded_v1",
"preshift_bits": 4,
"hash": "identity",
"minishard_bits": 0,
"shard_bits": 0,
"minishard_index_encoding": "gzip",
"data_encoding": "gzip",
}
# tensorstore written in sharded format
spec = {
"driver": "neuroglancer_precomputed",
"kvstore": {
"driver": "file",
"path": "/tmp/tensorstore",
},
"multiscale_metadata": {
"type": "image",
"data_type": arr.dtype.name,
"num_channels": arr.shape[3],
},
"scale_metadata": {
"size": arr.shape[:3],
"chunk_size": chunk_size,
"resolution": (1,1,1),
"encoding": "raw",
"sharding": sharding,
},
"create": True,
"delete_existing": True,
}
dataset_future = ts.open(spec)
ds = dataset_future.result()
ds[:] = arr
vol = cloudvolume.CloudVolume("file:///tmp/tensorstore")
vol[()]
EmptyVolumeException Traceback (most recent call last)
<ipython-input-6-b02c9221f2ec> in <module>
47
48 vol = cloudvolume.CloudVolume("file:///tmp/tensorstore")
---> 49 vol[()]
~/.local/share/virtualenvs/starmap-T47byR32/lib/python3.8/site-packages/cloudvolume/frontends/precomputed.py in __getitem__(self, slices)
527 requested_bbox = Bbox.from_slices(slices)
528
--> 529 img = self.download(requested_bbox, self.mip)
530 return img[::steps.x, ::steps.y, ::steps.z, channel_slice]
531
~/.local/share/virtualenvs/starmap-T47byR32/lib/python3.8/site-packages/cloudvolume/frontends/precomputed.py in download(self, bbox, mip, parallel, segids, preserve_zeros, agglomerate, timestamp, stop_layer, renumber)
575 parallel = self.parallel
576
--> 577 tup = self.image.download(bbox, mip, parallel=parallel, renumber=bool(renumber))
578 if renumber:
579 img, remap = tup
~/.local/share/virtualenvs/starmap-T47byR32/lib/python3.8/site-packages/cloudvolume/datasource/precomputed/image/__init__.py in download(self, bbox, mip, parallel, location, retain, use_shared_memory, use_file, order, renumber)
149
150 spec = sharding.ShardingSpecification.from_dict(scale['sharding'])
--> 151 return rx.download_sharded(
152 bbox, mip,
153 self.meta, self.cache, spec,
~/.local/share/virtualenvs/starmap-T47byR32/lib/python3.8/site-packages/cloudvolume/datasource/precomputed/image/rx.py in download_sharded(requested_bbox, mip, meta, cache, spec, compress, progress, fill_missing, order)
76 chunkdata = None
77 else:
---> 78 raise EmptyVolumeException(cutout_bbox)
79
80 img3d = decode(
EmptyVolumeException: Bbox([0, 4, 0],[2, 6, 2], dtype=int32)
The user email should be configurable by the user; it is hard-coded now. I am not sure how to change this, though.
https://github.com/seung-lab/igneous/blob/master/igneous/task_creation.py#L31
oauth2client has been deprecated in favor of google-auth and should be upgraded.
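A hedged sketch of the corresponding change, assuming the credentials come from the service-account JSON already used by CloudVolume (google.oauth2 supersedes the deprecated oauth2client API):
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/root/.cloudvolume/secrets/google-secret.json'
)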
I am trying to run GPU inference for a mitochondria map, but I get some errors; any ideas? @william-silversmith
I have tried upgrading taskqueue, but I still have this problem. I am using the chunkflow branch though; do I need to merge master?
(jwu) ~/workspace/igneous/igneous$ export SQS_URL=https://sqs.us-east-1.amazonaws.com/098703261575/jwu-igneous; export LEASE_SECONDS=3600; python task_execution.py --qurl https://sqs.us-east-1.amazonaws.com/098703261575/jwu-igneous --loop
Pulling from pull-queue://https://sqs.us-east-1.amazonaws.com/098703261575/jwu-igneous
raised 'NotImplementedError' object has no attribute 'lease'
Traceback (most recent call last):
File "task_execution.py", line 71, in execute
task = tq.lease(tag=tag, seconds=int(LEASE_SECONDS))
File "/usr/people/jingpeng/workspace/igneous/jwu/lib/python3.5/site-packages/taskqueue/taskqueue.py", line 152, in lease
tasks = self._api.lease(
AttributeError: 'NotImplementedError' object has no attribute 'lease'
undefined task
on host seungworkstation20
Traceback (most recent call last):
File "task_execution.py", line 94, in <module>
command()
File "/usr/people/jingpeng/workspace/igneous/jwu/lib/python3.5/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/usr/people/jingpeng/workspace/igneous/jwu/lib/python3.5/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/usr/people/jingpeng/workspace/igneous/jwu/lib/python3.5/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/people/jingpeng/workspace/igneous/jwu/lib/python3.5/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "task_execution.py", line 34, in command
execute(tag, queue, server, qurl, loop)
File "task_execution.py", line 71, in execute
task = tq.lease(tag=tag, seconds=int(LEASE_SECONDS))
File "/usr/people/jingpeng/workspace/igneous/jwu/lib/python3.5/site-packages/taskqueue/taskqueue.py", line 152, in lease
tasks = self._api.lease(
AttributeError: 'NotImplementedError' object has no attribute 'lease'
With the current behavior, the delete task just keeps following the redirect key specified in the info file. That's definitely bad - if you want to delete a dataset, the path provided should be unambiguous. And it prevents deleting the dataset that's masked by the redirect. (We do have such datasets)
Opening this issue to determine what the best approach is.
I would just pass max_redirects=0 for create_deletion_task as well as DeleteTask.execute(), but maybe there is a better way?
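A rough sketch of the idea (layer_path here is hypothetical; the point is only that CloudVolume accepts max_redirects):
from cloudvolume import CloudVolume

# Hypothetical: inside the deletion task, open the volume without following the
# "redirect" key in the info file so the given path is deleted, not its target.
vol = CloudVolume(layer_path, max_redirects=0)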
In the Dockerfile, all of the Boost libraries were installed, which is pretty big.
Which library needs this?
Maybe we can downsize to a few Boost packages?