bioroboticslab / bb_binary
Binary Format for the BeesBook detections
License: Apache License 2.0
Maybe change the docs, but the output is just not sorted if cam=None. It is only sorted if cam has some value; otherwise it is ordered by camId. This is not consistent with the documentation.
The BeesBook binary schema is now in a more mature form. It is time to review and criticize it.
You can read it here: https://github.com/BioroboticsLab/bb_binary/blob/master/bb_binary/bb_binary_schema.capnp
Even if you are not familiar with Cap'n Proto, it should be fairly readable.
Please read the schema with your particular use case in mind and check whether it can fulfill it.
Some questions:
Do the Frame and FrameContainer structs make sense?
How are global ids generated in practice? tagIdx is counted from 0 for every new frame. Would a globally unique id be useful for tracking?
Should there be DetectionCVP for detections from the computer vision pipeline and DetectionDP for detections from the deep pipeline?
For 2016 we are using a new timestamp format which is compliant with ISO 8601.
The new format is: Cam_#ID#_YYYY-MM-DDThh:mm:ss.iiiiii--YYYY-MM-DDThh:mm:ss.iiiiii.#Extension#
E.g.: Cam_0_2016-07-11T17:49:58.256603--2016-07-11T17:49:58.256603.mkv
bb_binary needs to be adjusted to be able to read this format. Maybe add some auto-detection?
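As a sketch of such auto-detection, the new file names could be parsed with a regular expression. The helper name `parse_video_fname` and its return shape below are assumptions for illustration, not the actual bb_binary API:

```python
import re
from datetime import datetime

# Pattern for the 2016 format: Cam_<id>_<begin>--<end>.<ext>
FNAME_RE = re.compile(
    r"Cam_(?P<cam>\d+)_"
    r"(?P<begin>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6})--"
    r"(?P<end>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6})"
    r"\.(?P<ext>\w+)$"
)

def parse_video_fname(fname):
    """Extract cam id, begin/end timestamps and extension from a file name."""
    m = FNAME_RE.match(fname)
    if m is None:
        raise ValueError("not a valid video file name: {}".format(fname))
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    return (int(m.group("cam")),
            datetime.strptime(m.group("begin"), fmt),
            datetime.strptime(m.group("end"), fmt),
            m.group("ext"))
```

Auto-detection could then try this pattern first and fall back to the old format if it does not match.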
Printing a FrameContainer in a Jupyter notebook kills it. Change the __repr__ function so it shortens the output before printing it.
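One possible approach, sketched here rather than the actual fix, is to truncate the repr output at a fixed length:

```python
def short_repr(obj, max_len=200):
    """Return repr(obj), truncated so huge objects do not flood the notebook."""
    text = repr(obj)
    if len(text) <= max_len:
        return text
    # keep a prefix and report how much was cut off
    return text[:max_len] + "... ({} chars total)".format(len(text))
```

A FrameContainer.__repr__ could delegate to such a helper instead of serializing all frames.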
It would be helpful if there were a function iter_frames(start=None, end=None, cam=None) analogous to iter_fnames. This function would iterate through frames in a given time range. Start is inclusive, end is exclusive. If no start or end is given, it behaves like iter_fnames.
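A minimal sketch of how such an iterator could be built on top of iter_fnames. The repo interface below is duck-typed for illustration; in bb_binary the loader is the module-level load_frame_container, so treat the exact call sites as assumptions:

```python
def iter_frames(repo, begin=None, end=None, cam=None):
    """Yield (frame, frame_container) for all frames with
    begin <= frame.timestamp < end (begin inclusive, end exclusive)."""
    for fname in repo.iter_fnames(begin=begin, end=end, cam=cam):
        fc = repo.load_frame_container(fname)  # stands in for bb_binary.load_frame_container
        for frame in fc.frames:
            # iter_fnames only filters on the file level,
            # so every single frame has to be checked again
            if begin is not None and frame.timestamp < begin:
                continue
            if end is not None and frame.timestamp >= end:
                continue
            yield frame, fc
```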
At the moment the Python version of bb_binary will not work on Windows because it relies on pycapnp.
There are several issues regarding this problem:
Current workarounds in the Biorobotics Lab are:
Nonetheless we should watch the development of the Windows-related pycapnp issues and keep track of other/better workarounds.
When folders like 00, 20, and 40 are missing (due to maintenance or whatever else), an AssertionError is raised. Maybe that's because it keeps searching for the next directory until it reaches the end of the repository's folder structure. I don't know what exactly is happening, but it somehow iterates over all the folders until it reaches the end.
In commit ac16fd8, I introduced a script to convert ground truth data to HDF5 files that also contain tag images. We should move it to another repository that also holds other utility stuff.
If iterating over a repository with iter_frames using begin and end, and both parameters are not within the repository limits, an AssertionError is raised. I would expect something empty to be returned, or a helpful hint about being outside of the repository limits.
The Repository class has the following use cases:
Let's assume a user wants to read data from an existing repository but requests a wrong path. The Repository class assumes the user wants to create a new repository and creates it. Metaphorically speaking, the user is now "waiting" for data from the repository and the repository is "expecting" data from the user. I suspect neither's needs will be satisfied :(
The same applies to the second use case: a user wants to add data to an existing repository but accidentally creates a new repository...
To avoid this situation it might be helpful if you could tell the Repository class what you are planning to do. The Repository class will then do it, or tell you that it can't.
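One way to express this intent could be an explicit mode argument. The `open_repository` wrapper and its mode names below are purely hypothetical; `Repository` stands for the real bb_binary class:

```python
import os

def open_repository(path, mode="read"):
    """Guarded repository access: "read"/"append" require an existing
    repository; "create" explicitly makes a new one."""
    if mode in ("read", "append"):
        if not os.path.isdir(path):
            raise FileNotFoundError(
                "no repository at {!r}; pass mode='create' to make one".format(path))
    elif mode == "create":
        os.makedirs(path, exist_ok=True)
    else:
        raise ValueError("unknown mode: {!r}".format(mode))
    return path  # in the real code: return Repository(path)
```

A typo in the path would then fail loudly instead of silently creating an empty repository.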
Iterating over frames while using the begin and end parameters may throw an error. The reason for the error could not be found.
I used one hour of data: repo = Repository('data/1h_09-25-15/')
Code:
repo = Repository('data/1h_09-25-15/')
frames = repo.iter_frames(begin=ts1, end=ts2)
list(frames)
Error:
-----------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-6-e04f7628f3b5> in <module>()
----> 1 list(frames)
/opt/anaconda2/envs/bee3/lib/python3.5/site-packages/bb_binary/repository.py in iter_frames(self, begin, end, cam)
212 FrameContainer (FrameContainer): the corresponding FrameContainer for each frame.
213 """
--> 214 for f in self.iter_fnames(begin=begin, end=end, cam=cam):
215 fc = load_frame_container(f)
216 for frame in fc.frames:
/opt/anaconda2/envs/bee3/lib/python3.5/site-packages/bb_binary/repository.py in iter_fnames(self, begin, end, cam)
191 else:
192 current_path = self._step_to_next_directory(
--> 193 current_path, direction='forward')
194 current_path = self._join_with_repo_dir(current_path)
195
/opt/anaconda2/envs/bee3/lib/python3.5/site-packages/bb_binary/repository.py in _step_to_next_directory(self, path, direction)
306 path_dt = self._get_time_from_path(path)
307 end_dt = self._get_time_from_path(end)
--> 308 assert path_dt < end_dt
309
310 elif direction == 'backward':
AssertionError:
The C++ converter tools need to be adapted to the recent changes.
What exactly do the different fields mean in the real setup? E.g. which axes of the hive are x, y, z (upward, right, normal/perpendicular)? Documenting this would lower the entry barrier for new members.
The x and y coordinates are currently swapped. This leads to a lot of weird code where we always have to correct this mistake.
I would propose to fix this mistake. I previously assumed that this was intended; recent development revealed that it is not.
Examples of weird code:
ax.quiver(y, x, ...)

def rotate_direction_vec(rotation):
    x, y = 0, 10
    sined = np.sin(rotation)
    cosined = np.cos(rotation)
    normed_x = x * cosined - y * sined
    normed_y = x * sined + y * cosined
    return [np.around(normed_x, decimals=2), np.around(normed_y, decimals=2)]
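With the axes fixed, the same logic would read naturally. This is a sketch of the intended form (using the stdlib math module for self-containedness), not a committed change:

```python
import math

def rotate_direction_vec(rotation):
    # with x and y unswapped, the base direction vector points along x
    x, y = 10, 0
    normed_x = x * math.cos(rotation) - y * math.sin(rotation)
    normed_y = x * math.sin(rotation) + y * math.cos(rotation)
    return [round(normed_x, 2), round(normed_y, 2)]

# plotting would then no longer need the swap:
# ax.quiver(x, y, ...)
```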
We should provide the inverse of int_id_to_binary().
Possible signature:
def binary_id_to_int(decoded_id, threshold=0.5, big_endian=True):
    """Helper to convert an id represented as a bit array to an integer.
    Arguments:
        decoded_id (:obj:`list` of int or float): id as bit array
    Keyword Arguments:
        threshold (Optional float): `decoded_id` values >= threshold are interpreted as 1
        big_endian (Optional bool): assume `decoded_id` is formatted big-endian
            if set to :obj:`True`, or little-endian if set to :obj:`False`
    Returns:
        int: the decoded id represented as an integer
    """
    pass
Please note the different endianness of decodedId in DetectionDP and int_id_to_binary()!
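A sketch of what the implementation could look like, under the documented semantics (values are discretised at the threshold; big-endian means the first entry is the most significant bit):

```python
def binary_id_to_int(decoded_id, threshold=0.5, big_endian=True):
    """Convert an id given as a bit array (ints or probabilities) to an integer."""
    # discretise: everything >= threshold counts as a 1-bit
    bits = [1 if b >= threshold else 0 for b in decoded_id]
    if not big_endian:
        # little-endian: first entry is the least significant bit
        bits = reversed(bits)
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```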
The method Repository.iter_fnames() accepts the keyword arguments begin and end. When begin and end are in the same interval, the method will yield nothing. Or, to use the illustration from the docstring:
Files:          A               B
Frames: |---------------|---------------|
            ⬆       ⬆
          begin     end
Instead of A, no filename is yielded.
The current schema only partially supports stitched data. Besides hiveX and hiveY, the orientation is also needed.
Suggestion: put a struct Stitched into DetectionDP and DetectionTruth. See below for an implementation.
I also suggest removing hiveX and hiveY completely.
struct Stitched {
    xpos @0 :UInt16;       # x coordinate of the grid center wrt. the comb
    ypos @1 :UInt16;       # y coordinate of the grid center wrt. the comb
    zRotation @2 :Float32; # rotation of the grid in z plane
    yRotation @3 :Float32; # rotation of the grid in y plane
    xRotation @4 :Float32; # rotation of the grid in x plane
}
struct DetectionDP {
    idx @0 :UInt16;        # sequential index of the detection, counted from 0 for every frame;
                           # the combination (idx, Frame.id) is a global key
    xpos @1 :UInt16;       # x coordinate of the grid center wrt. the image
    ypos @2 :UInt16;       # y coordinate of the grid center wrt. the image
    zRotation @3 :Float32; # rotation of the grid in z plane
    yRotation @4 :Float32; # rotation of the grid in y plane
    xRotation @5 :Float32; # rotation of the grid in x plane
    radius @6 :Float32;    # radius of the tag
    localizerSaliency @7 :Float32; # saliency of the localizer network
    decodedId @8 :List(UInt8);  # the decoded id; the bit probabilities are discretised to 0-255.
                                # p(first bit == 1) = decodedId[0] / 255
    descriptor @9 :List(UInt8); # visual descriptor of the detection, ordered from most
                                # significant eight bits to least significant eight bits
    stitched @10 :Stitched;     # the stitched data, optional
}
# Corresponds to an image in the video.
struct Frame {
id @0 :UInt64; # global unique id of the frame
dataSourceIdx @1:UInt32; # the frame is from this data source
frameIdx @6 :UInt32; # sequential increasing index for every data source.
timestamp @2 :Float64; # unix time stamp of the frame
timedelta @3 :UInt32; # time difference between this frame and the frame before in microseconds
detectionsUnion : union {
detectionsCVP @4 :List(DetectionCVP); # detections format of the old computer vision pipeline
detectionsDP @5 :List(DetectionDP); # detections format of the new deep pipeline
detectionsTruth @7 :List(DetectionTruth); # detections format of ground truth data
}
}
This should be done in conjunction with #15
It is currently impossible to determine the Cap'n Proto schema version and the needed code version. Add a new field to FrameContainer which allows one to see what version of the schema is used:
struct FrameContainer {
    id @0 :UInt64;                    # global unique id of the frame container
    dataSources @1 :List(DataSource); # list of data sources (videos / images)
    fromTimestamp @2 :Float64;        # unix timestamp of the first frame
    toTimestamp @3 :Float64;          # unix timestamp of the last frame
    frames @4 :List(Frame);           # frames must be sorted in the order they were recorded.
    camId @5 :UInt16;                 # the cam number
    hiveId @6 :UInt16;                # the id of the hive
    transformationMatrix @7 :List(Float32);
    # the transformation matrix from image coordinates to hive coordinates.
    # The matrix is of dimension 4x4 and stored row by row:
    #  1 |  2 |  3 |  4
    #  5 |  6 |  7 |  8
    #  9 | 10 | 11 | 12
    # 13 | 14 | 15 | 16
    schemaVersion @8 :UInt16;         # the version of the schema
}
I also propose to tag each version of the code (this repo) with the following scheme:
v[schema_version].[minor_version]
schema_version is the capnp schema version the code supports.
minor_version is incremented for code changes that do not break the schema.
This allows our legacy projects to still be supported and even get necessary updates. Old projects will otherwise break on every change.
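With such a field, readers could fail early on incompatible data. A hypothetical check follows; both the `schemaVersion` attribute and the supported-version constant come from this proposal, not the current code:

```python
SUPPORTED_SCHEMA_VERSION = 1  # hypothetical value for illustration

def check_schema_version(frame_container):
    """Raise if the container was written with a different schema version."""
    version = getattr(frame_container, "schemaVersion", None)
    if version != SUPPORTED_SCHEMA_VERSION:
        raise RuntimeError(
            "schema version {} not supported (code supports {})".format(
                version, SUPPORTED_SCHEMA_VERSION))
```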
When using parameters like start and end, it should be documented and tested whether they are inclusive or exclusive.
Problem found in iter_fnames:
When using start and end as arguments for iter_frames, and the time period is fairly small, nothing is returned. I am not sure if it should behave like this (?), and it's not an issue. But I thought maybe it is not intuitive. I guess iter_fnames is not returning anything, but should (?).
If I ask for 10 seconds of frames, I would expect to get about 30 frames for each camera, but that's not the case.
import sys
import datetime

path = "../00_Data/testset_2015_20m/2015082215/"
repo = Repository(path)

# find the smallest timestamp in the repository
min_ts = sys.maxsize
for frame, fc in repo.iter_frames():
    if frame.timestamp < min_ts:
        min_ts = frame.timestamp
print(min_ts)
print(datetime.datetime.fromtimestamp(min_ts))

s = min_ts + 1
e = s + 60
print(datetime.datetime.fromtimestamp(s))
print(datetime.datetime.fromtimestamp(e))

count = 0
for frame, fc in repo.iter_frames(begin=s, end=e, cam=None):
    count += 1
print(count)