Comments (9)
Good point. I used OpenCV for frame extraction. As JPEG is a lossy format, the default quality setting in ffmpeg might differ from OpenCV's default quality. A solution similar to this should work.
import argparse
import os
from pathlib import Path

import cv2


def extract_frames(video_path, frames_dir):
    # Frames for each video go into a subdirectory named after the video file
    video_name = Path(video_path).stem
    video_frames_dir = os.path.join(frames_dir, video_name)
    os.makedirs(video_frames_dir, exist_ok=True)

    cap = cv2.VideoCapture(video_path)
    frame_count = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # Frames are numbered from 000000.jpg
        frame_path = os.path.join(video_frames_dir, f"{frame_count:06d}.jpg")
        cv2.imwrite(frame_path, frame)
        frame_count += 1
    cap.release()
    print(f"Extracted {frame_count} frames from {video_path} to {video_frames_dir}")


def main(videos_dir, frames_dir):
    os.makedirs(frames_dir, exist_ok=True)
    # Loop through all the video files in the video directory
    for video_file in os.listdir(videos_dir):
        if video_file.endswith(".avi") or video_file.endswith(".mp4"):
            video_path = os.path.join(videos_dir, video_file)
            extract_frames(video_path, frames_dir)


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--videos_dir",
        type=str,
        required=True,
        help="Directory path to the videos.",
    )
    parser.add_argument(
        "--frames_dir",
        type=str,
        required=True,
        help="Directory path to the frames.",
    )
    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parse_args()
    main(args.videos_dir, args.frames_dir)
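To check whether two extractions of the same video really differ, the frames first have to be paired up. The script above numbers frames from 000000.jpg, while the ffmpeg pattern %04d.jpg used elsewhere in this thread starts at 0001.jpg by default. The following is only a rough sketch of that pairing; `pair_frames` and its arguments are hypothetical names, not part of the repository, and it assumes every .jpg file name is purely numeric.

```python
from pathlib import Path


def pair_frames(opencv_dir, ffmpeg_dir):
    """Pair frames by position: OpenCV frame i <-> ffmpeg frame i + 1.

    Assumption: OpenCV frames start at 000000.jpg and ffmpeg frames at
    0001.jpg (ffmpeg's default start number for image sequences).
    """
    opencv_frames = {int(p.stem): p for p in Path(opencv_dir).glob("*.jpg")}
    # Shift ffmpeg indices by one so both sets are 0-based
    ffmpeg_frames = {int(p.stem) - 1: p for p in Path(ffmpeg_dir).glob("*.jpg")}

    common = sorted(opencv_frames.keys() & ffmpeg_frames.keys())
    pairs = [(opencv_frames[i], ffmpeg_frames[i]) for i in common]
    # 0-based indices present in only one of the two directories
    unmatched = sorted(opencv_frames.keys() ^ ffmpeg_frames.keys())
    return pairs, unmatched
```

Each returned pair can then be decoded (for example with cv2.imread) and compared pixel by pixel. A non-empty `unmatched` list suggests the two tools did not even produce the same number of frames.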
from anomalyclip.
Thank you for the support! That was indeed the issue.
from anomalyclip.
Hi Kenny, thank you!
Thank you for sharing the code snippet. After adding the following lines of code:
data["net"]["stride"] = 1
data["net"]["ncrops"] = 1
data["net"]["labels_file"] = "data/sht_labels.csv"
data["net"]["normal_id"] = 8
I tested the modified snippet and observed that the extracted features closely align with the ones provided in the Google Drive folder. The only notable difference is the size of the first dimension, which comes from the padding applied in test mode: the shape is [1024, 512] rather than [764, 512], where 764 is the number of frames and 512 is the embedding size.
Could you provide more details on how you computed clip_extracted.txt?
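For anyone repeating this comparison, here is a minimal sketch of how padded features could be trimmed and compared against a reference. It assumes the padding rows are appended after the real frames, which this thread does not confirm; `trim_padding` and `max_abs_diff` are hypothetical helpers, not functions from the repository.

```python
def trim_padding(features, num_frames):
    """Keep only the first num_frames rows.

    Assumption: padding rows come after the real frames. Works on any
    sliceable sequence (nested lists, NumPy arrays, tensors).
    """
    if num_frames > len(features):
        raise ValueError("num_frames exceeds the number of feature rows")
    return features[:num_frames]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D sequences."""
    if len(a) != len(b):
        raise ValueError("feature matrices have different numbers of rows")
    return max(
        (abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)),
        default=0.0,
    )
```

With the shapes mentioned above, `trim_padding(features, 764)` would reduce a [1024, 512] result to [764, 512] before calling `max_abs_diff` against the reference features.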
from anomalyclip.
I've kept the code the same, except for moving CLIP to CUDA.
At L132 of anomaly_clip.py, I print image_features[0][0][0].
from anomalyclip.
If you are only interested in the CLIP features, then you should use L121. The code at line 132 contains image features rearranged to accommodate videos with arbitrary lengths, enabling compatibility with our Temporal model.
from anomalyclip.
The results at L121 and L132 are the same, since I'm only comparing the features of the first frame.
from anomalyclip.
A possibility is that the CLIP model is not loaded correctly. I am using the ViT-B/16 model, which should be automatically downloaded and stored at .cache/clip/ViT-B-16.pt. Could you try removing the cached version and then run the code again? This should download the model again.
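Removing the cached checkpoint could be done along these lines. This is just a sketch: `remove_cached_checkpoint` is a hypothetical helper, and the cache directory passed to it is an assumption that should be adjusted to wherever the file actually lives on your machine.

```python
from pathlib import Path


def remove_cached_checkpoint(cache_dir, filename="ViT-B-16.pt"):
    """Delete the cached checkpoint if present; return True if a file was removed."""
    # expanduser() resolves a leading "~" to the home directory
    checkpoint = Path(cache_dir).expanduser() / filename
    if checkpoint.is_file():
        checkpoint.unlink()
        return True
    return False
```

For example, `remove_cached_checkpoint("~/.cache/clip")` assumes the .cache directory mentioned above sits in the home directory. A return value of False means no file was found at that location, so the model may be cached somewhere else.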
from anomalyclip.
I've tried that. It's still the same.
Could it be the way I extracted the frames?
I used the following bash script to extract them.
for f in ./videos/*
do
    # Strip the directory and the .avi extension to get the video name
    s=${f##*/}
    s=${s%.avi}
    echo "$s"
    mkdir -p "./frames/$s"
    ffmpeg -i "$f" "./frames/$s/%04d.jpg"
done
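As a point of comparison, the same ffmpeg call can be written with the JPEG quality and the starting frame number made explicit. This is a sketch only: `build_ffmpeg_command` and `extract_with_ffmpeg` are hypothetical helpers, the value qscale=2 is an arbitrary high-quality choice (lower values mean higher quality for ffmpeg's JPEG encoder), and the output is still not expected to be byte-identical to frames written by OpenCV.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frames_dir, qscale=2, start_number=0):
    """Build an ffmpeg command that writes <frames_dir>/<video name>/%06d.jpg."""
    out_dir = Path(frames_dir) / Path(video_path).stem
    return [
        "ffmpeg",
        "-i", str(video_path),
        "-qscale:v", str(qscale),             # explicit JPEG quality
        "-start_number", str(start_number),   # ffmpeg defaults to 1
        str(out_dir / "%06d.jpg"),
    ]


def extract_with_ffmpeg(video_path, frames_dir):
    """Create the output directory and run ffmpeg (must be on PATH)."""
    cmd = build_ffmpeg_command(video_path, frames_dir)
    Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)
```

Starting at 0 with six digits matches the file naming used by the OpenCV script in this thread, which makes the two sets of frames easier to compare one to one.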
from anomalyclip.
I am glad to hear that! Thanks for pointing this out, we will be sure to include this information in the readme.
from anomalyclip.