
Comments (23)

sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hi @kylecnewton! Sorry about this, but I think we can figure it out. The last lines give a hint, and I was able to replicate your error:

b'C:\Users\Fish_Behavior\Desktop\SIMBA\1LZF_model\1LZF_model_07_28\project_folder\csv\features_extracted\20200318_AB_7dpf_ctl.csv' does not exist:

SimBA is looking for a CSV file in your project_folder/csv/features_extracted directory with the same name as your directory containing the frames you are labelling. So if you are labelling frames in a folder called 20200318_AB_7dpf_ctl, then there should be a CSV file with features in this path : project_folder/csv/features_extracted/20200318_AB_7dpf_ctl.csv.
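A quick way to check which labelled-frame folders are missing their matching features CSV (a hedged sketch: `missing_feature_files` and the directory layout are illustrative, not a SimBA function):

```python
from pathlib import Path

def missing_feature_files(frames_dir, features_dir):
    """Return frame-folder names that lack a matching features CSV.

    SimBA looks for features_extracted/<folder_name>.csv for every folder of
    labelled frames, so any name returned here would reproduce the
    'does not exist' error above.
    """
    frames_dir, features_dir = Path(frames_dir), Path(features_dir)
    return sorted(d.name for d in frames_dir.iterdir()
                  if d.is_dir() and not (features_dir / (d.name + ".csv")).exists())
```

Running this over your project's frames and features directories should list exactly the videos that will fail during labelling.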

You mention that you have extracted the features, but is it possible that the names don't match up? (e.g., the video frame folder has been renamed somewhere along the process?)

I will make sure SimBA prints a statement that is more informative.

from simba.

kylecnewton avatar kylecnewton commented on July 30, 2024

Hi Sam,

I have my extracted frames in C:\Users\Fish_Behavior\Desktop\SIMBA\1LZF_model\1LZF_model_07_28\project_folder\frames\20200318_AB_7dpf_ctl
but there is no corresponding CSV file in the csv\features_extracted folder. This is the case for all 20 videos: frames extracted but no csv file. hmm

I did notice one thing: I analyzed these videos using an old DLC model that had 5 fish body part labels instead of the 7 I defined when I created the Simba project config file. Furthermore, the labels for the 5 body parts in that DLC csv file are not exactly the same as those in the Simba config file. I assume that the number and names of the labeled body parts in DLC model must match those in Simba, correct?

thanks for your help. BTW I love the detail of the step by step tutorials on GitHub - infinitely better than DLC ;)


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hey @kylecnewton - to be honest, I have not explored how tracking one set of body-parts and defining another in SimBA affects the workflow, but I'd go back, create a new project, and specify the correct number and order of body-parts at project creation.

When you create a project and specify the number of body-parts, you also tell SimBA how many columns to expect when reading in your pose-estimation files at all the different steps. If SimBA finds more or fewer columns in your files, you are likely to hit errors. The number of body-parts in your project also dictates how many and which features are created.

To save your annotated images you first have to generate the features. After correcting the outliers (or pressing skip outlier correction in your case), you should click this button and it should take care of it. Let me know if it works!

image

And thanks! That warms!! :)


kylecnewton avatar kylecnewton commented on July 30, 2024

Hey @sgoldenlab , it worked!!! Here is the breakdown of what I have discovered so far:
1. I had to start over a few times to make sure I did everything exactly as in the tutorial; this included running a new DLCma 2.2b6 model on 1 fish.
2. Created a new SimBA project config with a user-defined pose config (the same 7 exact body parts and labels as the DLCma model) and selected multi-animal tracking (very important).
3. Imported videos and the DLCma h5 files (obviously CSV doesn't work).
4. Set video parameters, corrected outliers, set ROI, extracted features, and am currently labeling behaviors.

I have two questions about outlier correction: First, I chose the left eye and swim bladder for the body parts as they are the most obvious and reliable because the four tail labels are often off. My concern is that these structures are close together (~20px or 1mm apart) and I was wondering if I should choose the eye and tip of the tail (~120px or 5mm apart) even though the tail is tougher to track?

Second, I chose movement and location corrections of 0.7 and 1.5, respectively (it was in the bioRxiv preprint). Given the small size of the animals (5mm larvae in a 30x30mm arena, filmed at 1024x1024 resolution), does this seem reasonable?

A suggestion about the behavior labeling GUI: the navigation hotkeys are very counter-intuitive (o,e,w,t,s,x). Is there any way to use the actual arrow keys for single frame advance, alt+arrow for 10 frames, tab+arrow for 1 sec? This physical arrangement would make it easier to navigate video without having to constantly refer to the legend for hotkeys spread out across the keyboard. Or you could group all the reverse keys together and the forward keys together but mirroring each other in space and function: w = -10 fr, e = -2 fr , r = -1sec, then u = +1sec, i =+2fr, o = +10fr. I'm not sure if this is feasible but something like this might speed up the labeling process.

thanks again for all the help. I will keep you posted on the progress and let me know if you would like any files to add to the OSF database.


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hi @kylecnewton - how does your tracking look pre-outlier correction? Currently, the outlier correction tools struggle when users have user-defined body-parts + multi-animal tracking. I have written a fix, but the main scripts are not yet calling the new outlier correction function. In your case you only have one animal, as I understand it, so it will be fine, but as you introduce more animals you might hit an error. I hope to get this in towards the end of August.

For the body-parts I don't have a good answer; I think you need to try both ways. The tail end seems more intuitive, but it will be of no use if it's frequently and massively off.

I updated SimBA with your suggested keyboard shortcuts (w, e, r and u, i, o) - you should see them if you upgrade SimBA to 1.2.11. The modifier keys can be problematic in OpenCV, which the GUI runs on; otherwise I would have gone with that suggestion.

Thanks!


kylecnewton avatar kylecnewton commented on July 30, 2024

Hi @sgoldenlab, I seem to recall the tracking looking good out of DLCma. I used the original bx.h5 files instead of a bx.filtered.h5 files because I wanted to see how the Simba models performed. I suppose I could use the filtered data to see if the tail markers are more stable and reliable. Unfortunately, it is not always clear (to me) how the various GUI options in DLC can truly optimize tracking results. However, the models generated by the new DLCma software for a single fish are FAR SUPERIOR to the tracking generated by previous software versions. Once I optimize the tracking and behavioral classification results for one fish, I plan to add in videos with 2+ fish to see how well everything holds up.

Speaking of behaviors, my first SimBA model only had one behavior (positive rheotaxis = orienting and swimming into water flow). Should I have created two or more behaviors to give SimBA a "choice" for the trees, or is one behavior classification enough to begin learning the software? I assumed one behavior could be a yes/no sort of decision tree.

Wow, thanks for incorporating my suggestions. I am excited to see where all of this ML stuff goes and how I can implement it into ethological studies with wildlife in more natural settings. I like your idea of making all the classifiers open access so that we can compare apples to apples across studies. This is part of my motivation to create a "universal" larval zebrafish rheotaxis assay with a fundamental appeal to biomed and wildlife/eco researchers.


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hi @kylecnewton - sounds good. I think one of the main paths to improving multi-animal tracking in the maDLC interface is the 'tracklets' interface - https://www.youtube.com/watch?v=bEuBKB7eqmk - but it comes with some caveats at the moment: the hotkeys don't work in a Windows environment (this is related to why I didn't implement the modifier keys you suggested; there are some Windows/Linux clashes in how keystrokes are interpreted in OpenCV), so I haven't been able to explore it much. I think they will sort it out soon, and multi-animal tracking will become easier. If you have a Linux machine you should be good to go, though.

One classifier is good enough - it is a boolean yes/no - but SimBA will give you a probability that the behavior is occurring in each frame, and you titrate the discrimination threshold to get the yes/no decision boundary exactly right: https://github.com/sgoldenlab/simba/blob/master/docs/Scenario1.md#critical-validation-step-before-running-machine-model-on-new-data. When sharing your classifiers with others using slightly different recording set-ups / video formats / camera angles, shifting this discrimination threshold can often be enough to get them working and validated in the new setting.
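The titration step can be sketched in a few lines (a hypothetical helper, not SimBA's API; SimBA does this through its validation GUI):

```python
import pandas as pd

def classify(probabilities, threshold=0.5):
    """Turn per-frame behavior probabilities into boolean yes/no calls.

    Raising `threshold` trades missed events for fewer false positives;
    lowering it does the reverse. Validate on a held-out video.
    """
    return (pd.Series(probabilities) >= threshold).astype(int)

# Example: only the two high-confidence frames survive a 0.7 threshold.
calls = classify([0.1, 0.45, 0.8, 0.95], threshold=0.7)
```

When porting a classifier to a new rig, sweeping `threshold` against a validation video is often the only retuning needed.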


kylecnewton avatar kylecnewton commented on July 30, 2024

@sgoldenlab ok now it makes sense why the DLC tracklet GUIs are so glitchy to navigate on Windows.

re: Run Machine model/validation - this is my initial attempt to validate the model using a feature-extracted video not used in training. As you can see, there are not any nice peaks in the data. I am only looking at positive rheotaxis, or swimming into oncoming flow. The water is flowing top to bottom, so I am just looking for the fish to swim up in the frames. In the screenshots, I show frames where the fish is performing rheotaxis and where it is not, but the model classified it incorrectly (e.g., frame 1900). Any thoughts about how I can improve this? More annotated videos? Use filtered DLCma data? Use body parts spaced further apart for outlier correction?

I also included a screen shot of the DLC output video showing the 7 landmarks in the raw h5 data files. On further examination, the second landmark from the swimbladder toward the end of the tail seems stable so I will try to use that for outlier correction.

Thanks so much for all the help!

Validation_plot

Not_rheotaxis

Rheotaxis

Raw_track_jittery


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hi @kylecnewton - adding annotations is usually a very good approach but perhaps it's possible to be a bit more surgical than that...

The random forest seems to pick up a near on/off pattern, and something happens in this frame - is it possible to look at the video and see what is happening?

image

My first thought is that you have appended the ROIs as features (I think you mentioned you had earlier?), and the classifier has picked up a relationship between one or more of the ROIs and your annotations (e.g., most annotations for the animal swimming upwards happened to be when the animal was a certain distance away from, or more likely inside, a certain ROI). If you did not append ROI data I have to think a little more, and it would be helpful to know what happens at about 3200 frames. Removing the ROI features could help.

My fish experience is VERY(!) limited and I don't know what positive rheotaxis looks like :) Will the flow always come from the top, or can it sometimes come from other angles? If it is always from the top, we can insert a couple of lines in the feature extraction script: check the movements of all body-parts along Y normalized to body-part movements along X; this would come close to the answer, I think. We do a version of it to get at tail rattling in the mouse (tail-end movement normalized to centroid and tail-base movement), and it may work here too, but I don't know if your behavior is as simple as that.
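A minimal sketch of that feature, assuming flat CSVs with hypothetical `<bodypart>_x` / `<bodypart>_y` columns (SimBA's real column layout may differ):

```python
import numpy as np
import pandas as pd

def vertical_movement_ratio(df, bp="SwimBladder"):
    """Per-frame |dy| / (|dx| + |dy|) for one tracked body part.

    Values near 1 mean the animal is moving mostly along the flow axis
    (top to bottom in these videos); values near 0 mean mostly sideways
    movement. Column names like 'SwimBladder_x' are assumptions here.
    """
    dx = df[f"{bp}_x"].diff().abs()
    dy = df[f"{bp}_y"].diff().abs()
    # Avoid division by zero on stationary frames; treat them as 0.
    return (dy / (dx + dy).replace(0, np.nan)).fillna(0)
```

A feature like this, computed for every body part and smoothed over a short window, is close in spirit to the tail-rattling feature described for mice.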


kylecnewton avatar kylecnewton commented on July 30, 2024

Hi @sgoldenlab, yes I did append the ROI settings. I guess I assumed it was part of the linear workflow, going from tab to tab in the GUI. How do I remove the connection? Do I hit the reset button on the ROI table, or remove the ROI entries in the project_config.ini?

Positive rheotaxis is basically swimming upstream. The fish orients toward the source of the stimulus, which in this assay always moves from top to bottom. Negative rheotaxis would be swimming with the flow of water, or downstream.

I am in the middle of retraining the model on the filtered H5 files and with 100K rf N estimators (I wasn't sure if 100K was better than 2K, so I am experimenting). Once this is done, should I remove the ROIs, re-extract features, and retrain the model with 2K estimators? I can also add more annotated videos. I assume this will be an iterative process :)


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hi @kylecnewton - first, I'm sorry that I hadn't made this clear enough in the documentation / GUI. I will insert a sentence in the GUI next to "Append ROI data to features" to emphasize caution: this is optional and, in fact, should be avoided in most cases. It's only relevant when the behavior you are classifying has a strong spatial component.

For removing the ROI features from your dataset, I don't have a function at the moment, so it will have to be done by hand. How many videos did you annotate? I hope not too many yet.

In your project_folder/csv/targets_inserted directory, you will see the CSV files that are used to build your model. Each of these files contains all the pose-estimation data, your annotations, and the ROI features. The ROI features are right towards the end and are named according to what you named them, for example:

image

Go ahead and delete these columns in all the files in the project_folder/csv/targets_inserted directory, and then re-generate your classifiers using the Train Machine Model tab. This way SimBA will focus on swim speed and body-part movements, rather than pick up spurious relationships between your annotations and where in the arena your animal is.
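That by-hand step can be scripted; a hedged sketch (the `drop_roi_columns` helper and the ROI column naming are assumptions, and it overwrites files, so back up project_folder/csv/targets_inserted first):

```python
import pandas as pd
from pathlib import Path

def drop_roi_columns(targets_dir, roi_names):
    """Remove ROI-derived feature columns from every CSV in a directory.

    `roi_names` are the names you gave your ROIs when drawing them; any
    column whose name contains one of them is dropped. Files are rewritten
    in place, so work on a backup copy.
    """
    for csv_path in Path(targets_dir).glob("*.csv"):
        df = pd.read_csv(csv_path)
        keep = [c for c in df.columns
                if not any(roi in c for roi in roi_names)]
        df[keep].to_csv(csv_path, index=False)
```

After running it over targets_inserted, re-training from the Train Machine Model tab proceeds as usual, minus the spatial features.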

It is an "iterative process" forever :) but I think you will be able to get a classifier that performs well for your behavior relatively quickly. PS: the number of estimators, in my experience, reaches a ceiling rather quickly, and 2k can often be overkill in our settings.


kylecnewton avatar kylecnewton commented on July 30, 2024

Hi @sgoldenlab, I looked at the CSV files you mentioned and it appears that I did not append the ROI data to the features. Out of curiosity, I looked at the project_config.ini and noticed this:

[ROI settings]
animal_1_bp = Fish1_SwimBladder
animal_2_bp = No body parts
directionality_data =
visualize_feature_data =
no_of_animals = 1

Should I alter this or leave it alone? Other than this, it seems that I need to annotate more videos? I have used 20 so far: 19 for training and 1 for validation. I can add another 20 easily.


kylecnewton avatar kylecnewton commented on July 30, 2024

@sgoldenlab - one thing I forgot to ask: my videos are huge (30 sec x 200 fps = 6000 frames = 1.6GB each as AVI) and the extracted frames take up ~100GB or so. Is there any point at which I can remove those frames and free up some space? My original files are MOV and much smaller, but the conversion to AVI or MP4 doubled the SSD space needed. Any plans to support the MOV wrapper?

In the future I will be shooting at 60fps, based on your advice from the workshop - i.e., the behavior is discernible to the naked eye, so there is no need to oversample.


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Ouch, 200 fps is a lot! What is your resolution like?

I don't know how far down you can go with fps, but yes, the frames take up a lot of space. The only thing they are really useful for nowadays in SimBA is the human annotation labeling - I will look to get rid of that requirement when time allows and pull frames straight from the videos, together with getting all functions working with MOV. With the mice, we work with 30fps, and there is a tool to downsample fps in SimBA that I'd recommend using prior to pose-estimation:
image
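SimBA's downsampling tool drives FFmpeg under the hood; for reference, the equivalent command can be assembled like this (a sketch only: the exact flags SimBA passes may differ):

```python
def ffmpeg_fps_command(in_path, out_path, fps=30):
    """Build an FFmpeg command line that re-times a video to `fps`.

    FFmpeg's fps filter drops (or duplicates) frames to hit the target
    rate; any audio stream is copied through unchanged.
    """
    return ["ffmpeg", "-i", in_path,
            "-filter:v", f"fps={fps}",
            "-c:a", "copy", out_path]
```

Passing the resulting list to `subprocess.run` avoids shell quoting issues with paths that contain spaces.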


kylecnewton avatar kylecnewton commented on July 30, 2024

@sgoldenlab the resolution is 1024x1024. I am converting new videos to AVI and downsampling to 60fps as we speak. Then I will re-run them through DLCma and add them to the SimBA model. Hopefully this will be the solution.

So if I understand you correctly, I can move the frames for the previous 19 videos that I used for annotation and training the model to a storage HDD?


kylecnewton avatar kylecnewton commented on July 30, 2024

@sgoldenlab

FYI: when I try to downsample a batch of MOV videos in a folder, I get an error but if I do them individually it works.

ffmpeg version git-2020-06-26-7447045 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 9.3.1 (GCC) 20200621
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 55.100 / 56. 55.100
libavcodec 58. 93.100 / 58. 93.100
libavformat 58. 47.100 / 58. 47.100
libavdevice 58. 11.100 / 58. 11.100
libavfilter 7. 86.100 / 7. 86.100
libswscale 5. 8.100 / 5. 8.100
libswresample 3. 8.100 / 3. 8.100
libpostproc 55. 8.100 / 55. 8.100
C:/Users/Fish_Behavior/Desktop/SIMBA/1LZF_DLCma_model/project_folder/videos/New: No such file or directory


kylecnewton avatar kylecnewton commented on July 30, 2024

@sgoldenlab, oops, never mind - I had spaces in the folder name.


kylecnewton avatar kylecnewton commented on July 30, 2024

Hi @sgoldenlab, okay I have another error :/ I feel like I owe you authorship at this point ;)

This time I was able to import 19 more videos and h5 files for additional annotation, then run the outlier correction. SimBA generated the new movement and location CSV files but gave me this error:

Error: make sure all the videos that are going to be analyzed are represented in the project_folder/logs/video_info.csv file

I checked the file and the 19 new videos were not listed. I appended the new video file names to video_info.csv and was able to load a new video to annotate the rheotaxis behaviors. Then I got the following error, because the 19 new videos did not have their extracted-features CSV files generated. The old files are present in project_folder/csv/features_extracted.

Thanks again!

Exception in Tkinter callback
Traceback (most recent call last):
File "c:\users\fish_behavior.conda\envs\simba\lib\tkinter\__init__.py", line 1705, in __call__
return self.func(*args)
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\simba\labelling_aggression.py", line 135, in
self.generate = Button(self.window, text="Generate / Save csv", command=lambda: save_video(self.window))
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\simba\labelling_aggression.py", line 474, in save_video
data = pd.read_csv(input_file)
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\pandas\io\parsers.py", line 685, in parser_f
return _read(filepath_or_buffer, kwds)
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\pandas\io\parsers.py", line 457, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\pandas\io\parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\pandas\io\parsers.py", line 1135, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "c:\users\fish_behavior.conda\envs\simba\lib\site-packages\pandas\io\parsers.py", line 1917, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 689, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'C:\Users\Fish_Behavior\Desktop\SIMBA\1LZF_DLCma_model\project_folder\csv\features_extracted\\fps_30_20200317_AB_6dpf_ctl.csv' does not exist: b'C:\Users\Fish_Behavior\Desktop\SIMBA\1LZF_DLCma_model\project_folder\csv\features_extracted\\fps_30_20200317_AB_6dpf_ctl.csv'


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Hi @kylecnewton - no problem at all! This should be an easy one. However, SimBA-related questions are coming in thicker and faster, and I may not be able to answer very quickly, especially when it's something more complex.

A good indication of what's going on is in this tutorial: https://github.com/sgoldenlab/simba/blob/master/docs/Scenario4.md

What has happened is that you have added more videos; you need to extract the features for these files too before you start annotating the new videos. When you press Extract features, SimBA looks in your project_folder/csv/outlier_corrected_movement_location directory, processes each CSV in this folder, and saves the output in the project_folder/csv/features_extracted directory.

To extract the features for your new files, and no other files, I would look in your project_folder/csv/outlier_corrected_movement_location folder and remove the files you have already extracted features for (perhaps create a new folder called temp and move your already-processed files in there: that way, SimBA can no longer see them and won't process them again). Then open SimBA, load your project, and press Extract features. Your new features files should appear in your features_extracted folder. Let me know how that goes!
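The move-to-temp workaround can be scripted; a sketch under the assumption that processed files keep the same name in both folders (`stash_processed` is not a SimBA function):

```python
import shutil
from pathlib import Path

def stash_processed(outlier_dir, features_dir, temp_name="temp"):
    """Move CSVs that already have extracted features into a temp subfolder,
    so SimBA's Extract features step only sees the new, unprocessed files."""
    outlier_dir, features_dir = Path(outlier_dir), Path(features_dir)
    temp = outlier_dir / temp_name
    temp.mkdir(exist_ok=True)
    moved = []
    for csv_path in outlier_dir.glob("*.csv"):
        if (features_dir / csv_path.name).exists():
            shutil.move(str(csv_path), str(temp / csv_path.name))
            moved.append(csv_path.name)
    return sorted(moved)
```

After feature extraction finishes, the stashed files can simply be moved back out of the temp folder.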


kylecnewton avatar kylecnewton commented on July 30, 2024

Hi @sgoldenlab, your solution is working perfectly - thanks for the help.

Some GUI observations that might help others:

1- After loading the project and clicking the "further imports" tab, the physical placement of "import further tracking data" above "import further videos" implies that tracking data should be imported first, then videos. I suggest swapping their placement so people have a logical top-to-bottom workflow. This would also mirror the order of importing videos and then data when you originally create the project file.

2- When labeling behaviors, I like using the arrow buttons and jump advance below the frame display. Clicking on a single-frame advance is nice for precise labeling, but the immediate proximity of the single-frame advance to the first/last frame buttons makes it easy to accidentally jump to the beginning or end of the video. Then I have to hunt through 6000 frames to find where I was, unless I remember the frame number. My suggestion is to make the buttons a little bigger and separate them a bit so that folks don't get lost. Maybe the buttons are small because I am using 4K monitors?

Just my $0.02.


kylecnewton avatar kylecnewton commented on July 30, 2024

Hi @sgoldenlab, is there any way to use the DLC skeleton data and angle of orientation of the body segments (say from the swim bladder to the eyes) to filter out false positives? The model has changed due to adding new videos but it is still misidentifying negative rheotaxis (swimming with the flow) as positive rheotaxis (swimming against the flow).

Recall that the videos are shot from above and the flow goes from top to bottom, so positive rheotaxis is swimming with a "northward", "upward", or 12 o'clock orientation. If DLC defines a north-oriented body segment as both 0 and 360 degrees (a full circle), then the acceptable range of rheotactic orientation angles would be 0-45 degrees and 325-360 degrees.
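That orientation filter can be sketched like this (illustrative only: the 0-45 / 325-360 window is taken from the description above, and image coordinates are assumed to grow downward, as is typical for video frames):

```python
import math

def heading_deg(bp_from, bp_to):
    """Compass-style heading from one body part to another, in degrees.

    0 = north/up in the frame. Because image y-coordinates grow downward,
    dy is flipped so that movement toward the top of the frame reads as 0.
    """
    dx = bp_to[0] - bp_from[0]
    dy = bp_from[1] - bp_to[1]  # flip y-axis from image to compass convention
    return math.degrees(math.atan2(dx, dy)) % 360

def is_positive_rheotaxis(angle, lo=325.0, hi=45.0):
    """True if the heading falls in the wrap-around window [lo, 360) U [0, hi]."""
    return angle >= lo or angle <= hi
```

Computed per frame (say from the swim bladder to a point between the eyes), this gives a simple boolean filter for upstream-facing orientation that could feed into a custom feature extraction script.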

Does that make sense?


sgoldenlab avatar sgoldenlab commented on July 30, 2024

Thanks @kylecnewton - I put in your suggested updates: 5 pixels between the buttons, and reversed the order of the menus. If you update SimBA you should see it.

Yes, I think you are right - your behavior is very "spatial", but in a way that is not handled well either by the default feature scripts or by the inclusion of static ROIs. The feature you mention, and the features I mentioned earlier, should be enough to get the classifications you need.

I think the only way to handle it is to use this function in the development wheel of SimBA (pip install simba-uw-tf-dev), where you apply your own feature extraction script that calculates the metrics you need:

image

For more information, check this doc; I think this scenario is partly what it is meant for: https://github.com/sgoldenlab/simba/blob/master/docs/extractFeatures.md

I could help you write the first feature extraction script; it shouldn't take me very long (maybe next weekend). This function needs to be piloted more anyway, and this seems like a good opportunity, but you'll need to share the project with me - perhaps zipped through a gdrive if very large - [email protected]


kylecnewton avatar kylecnewton commented on July 30, 2024

@sgoldenlab, thanks so much - your help writing scripts would be essential. I am a noob to coding and command-line programs, so it would take me forever.
