fabvio / ld-lsi

Deep learning based lane/freespace detector embedded in a ROS node (built for UC3M LSI)

CMake 10.74% Python 89.26%
autonomous-car autonomous-driving autonomous-vehicles cnn convolutional-neural-networks deep-learning freespace lane-detection ros

ld-lsi's People

Contributors

fabvio, khizar-anjum


ld-lsi's Issues

Training is not happening with the train.py file

I am using the train.py file to train the network on the BDD dataset with 10 epochs, but the results are almost the same for each epoch. I made some changes in train.py to train it for 10 epochs, and training is performed on just 10% of the data. Could you please have a look?

@khizar-anjum
@fabvio

Q2) Where in the code are we passing labels for road classification? (See the note after the code below.)

Here is the code:

import torch
import torchvision
import os
from glob import glob
from PIL import Image
from utils import AverageMeter, iou, debug_val_example
from tqdm import tqdm
import numpy as np

class CrossEntropyLoss2d(torch.nn.Module):
    # NLLLoss is the negative log-likelihood loss; applied to log_softmax
    # outputs it gives the 2D semantic-segmentation cross-entropy loss
    def __init__(self, weight=None):
        super().__init__()
        self.loss = torch.nn.NLLLoss(weight)

    def forward(self, outputs, targets):
        # the model returns (segmentation, road classification); the road
        # head's output is discarded here
        outputs, _ = outputs
        return self.loss(torch.nn.functional.log_softmax(outputs, dim=1), targets)

def train(model, criterion, optimizer, data_loader, debug_data_loader, device, ntrain_batches):
    model.train()
    top1 = AverageMeter('Acc@1', ':6.2f')
    avgloss = AverageMeter('Loss', ':1.5f')

    cnt = 0
    for data in tqdm(data_loader):
        cnt += 1
        model.to(device)
        #print('.', end='')
        image, target = data['image'].to(device), data['label'].squeeze(1)
        target = torch.round(target * 255)  # converting back to the 0-255 range
        target = target.type(torch.int64).to(device)
        output = model(image)
        loss = criterion(output, target)
        # clears old gradients from the last step (otherwise you'd just
        # accumulate the gradients from all loss.backward() calls)
        optimizer.zero_grad()
        # computes the derivative of the loss w.r.t. the parameters
        # (or anything requiring gradients) using backpropagation
        loss.backward()
        # performs a parameter update based on the current gradient
        # and update rule (e.g. SGD)
        optimizer.step()
        acc1 = iou(output, target)
        #print("iou", acc1)
        top1.update(acc1, image.size(0))
        avgloss.update(loss, image.size(0))
        # if cnt % 10 == 0:
        #     # debug_val_example checks and prints a random debug example
        #     # every 10 batches. It uses the CPU, so it might be slow.
        #     # debug_val_example(model, debug_data_loader)
        #     print('Loss', avgloss.avg)
        #     print('Training: * Acc@1 {top1.avg:.3f}'.format(top1=top1))
        # if cnt >= ntrain_batches:
        #     return
    print('Loss', avgloss.avg)
    print('Training: * Acc@1 {top1.avg:.3f}'.format(top1=top1))
    # print('Full imagenet train set:  * Acc@1 {top1.global_avg:.3f}'
    #       .format(top1=top1))
    return

class berkely_driving_dataset(torch.utils.data.Dataset):
    def __init__(self, path, type='train', transform=None, color=True):
        # dataloader for the bdd100k segmentation dataset
        # path should contain the address of the bdd100k folder,
        # which generally has the following directory structure:
        """
        - bdd100k
            - drivable_maps
                - color_labels
                - labels
            - images
                - 100k
                - 10k
        """
        # type can either be 'train' or 'val'
        self.path = path
        self.type = type
        self.transform = transform
        self.imgs = glob(os.path.join(self.path, 'images/100k/' + self.type + '/*.jpg'))
        if color is True:
            print('True color')
            self.labels = [os.path.join(self.path, 'drivable_maps/color_labels/' + self.type,
                                        x.split('/')[-1][:-4] + '_drivable_color.png') for x in self.imgs]
        else:
            print('Color False')
            self.labels = [os.path.join(self.path, 'drivable_maps/labels/' + self.type,
                                        x.split('/')[-1][:-4] + '_drivable_id.png') for x in self.imgs]
        #print('labels', self.labels)
        self.length = len(self.imgs)

    # so that len(dataset) returns the size of the dataset
    def __len__(self):
        return self.length

    # to support indexing so that dataset[i] can be used to get the ith sample

def __getitem__(self, idx):
    if torch.is_tensor(idx):
        idx = idx.tolist()

    #print('idx', idx)
    img_name = self.imgs[idx]
    image = Image.open(img_name)
    label = Image.open(self.labels[idx])
    if self.transform:
        image = self.transform(image)
        label = self.transform(label)
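            # Caveat: RandomHorizontalFlip flips with an independent p=0.5 on
            # each call, so applying the same Compose separately to image and
            # label can flip one but not the other.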
    #print('img', image, 'label', label.shape)
    return {'image':image, 'label':label}

if __name__ == '__main__':
    """
    Here is an example of how to use these functions for training.
    This script is designed to train on the Berkeley driving dataset, so the
    PATH_TO_BERKELY_DATASET variable points to the root of that dataset. You
    might have to edit it.
    """

    # DEFINING SOME IMPORTANT VARIABLES
    PATH_TO_BERKELY_DATASET = '/home/sachin/Desktop/sampled_data/bdd100k'

    # loading libraries
    from models import erfnet_road

    # loading models
    roadnet = erfnet_road.Net()
    if torch.cuda.is_available():
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")

    # loading weights
    model_w = torch.load('/home/sachin/Desktop/ld-lsi/res/weights/weights_erfnet_road.pth')
    new_mw = {}
    for k, w in model_w.items():
        new_mw[k[7:]] = w  # strip the 7-character "module." prefix left by DataParallel
    roadnet.load_state_dict(new_mw)

    roadnet.to(device)
    roadnet.eval()  # note: train() switches the model back to train mode each epoch


    # Making Dataloaders
    bdd_transforms = torchvision.transforms.Compose([
        torchvision.transforms.Resize((360, 640)),
        torchvision.transforms.RandomHorizontalFlip(),
        torchvision.transforms.ToTensor()
    ])

    bdd_train = berkely_driving_dataset(PATH_TO_BERKELY_DATASET, transform=bdd_transforms, type='train', color=False)
    bdd_val = berkely_driving_dataset(PATH_TO_BERKELY_DATASET, transform=bdd_transforms, type='val', color=False)

    sampler_train = torch.utils.data.RandomSampler(bdd_train)
    print('train sampler', sampler_train)
    sampler_val = torch.utils.data.RandomSampler(bdd_val)
    print('val sampler', sampler_val)

    dl_train = torch.utils.data.DataLoader(
        bdd_train, batch_size=32,
        sampler=sampler_train)

    # for data in dl_train:
    #     print('k', data['image'].shape)

    # the validation only works with a batch size of 1
    dl_val = torch.utils.data.DataLoader(
        bdd_val, batch_size=32,
        sampler=sampler_val)

    # defining losses
    criterion = CrossEntropyLoss2d()
    optimizer = torch.optim.Adam(roadnet.parameters(), 10e-3)  # note: 10e-3 is 1e-2

    # training for 11 epochs (the 100-batch cap is currently commented out in train())
    for epoch in range(0, 11):
        print("epoch:", epoch)
        train(roadnet, criterion, optimizer, dl_train, dl_val, device, 100)
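Regarding Q2: as far as one can tell from this script, no road-classification labels are passed at all; CrossEntropyLoss2d unpacks the model output as outputs, _ = outputs and discards the road head, so only the segmentation target drives the loss. If one wanted to train the road head too, a combined loss along these lines might work (a minimal sketch; road_targets, a per-image class-index tensor, is a hypothetical extra field the dataset would have to provide):

class CombinedLoss(torch.nn.Module):
    # sketch: segmentation cross-entropy plus road-classification
    # cross-entropy, assuming the model returns (segmentation, road) outputs
    def __init__(self, weight=None):
        super().__init__()
        self.seg_loss = torch.nn.NLLLoss(weight)
        self.road_loss = torch.nn.CrossEntropyLoss()

    def forward(self, outputs, seg_targets, road_targets):
        seg_out, road_out = outputs
        seg = self.seg_loss(torch.nn.functional.log_softmax(seg_out, dim=1), seg_targets)
        return seg + self.road_loss(road_out, road_targets)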

Error when running the node

Hi,
I am getting an error when running the code. The error appears in cnn_node.py at line 57:
output = self.cnn(input_tensor)
The exception thrown is as follows:
Cannot obtain output. Exception: size mismatch, m1: [1 x 30720], m2: [7680 x 1024] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:249
Any ideas would be appreciated!

When I run the code, an error occurs: size mismatch

I have downloaded the model, and there is an error when executing the code. The printed message is:
File "/home/od/ld-lsi/scripts/erfnet_road.py", line 126, in forward
output_road = self.road_linear_1(output_road)
RuntimeError: size mismatch, m1: [1 x 10240], m2: [7680 x 1024] at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/generic/THCTensorMathBlas.cu:268

If I change line 98 of erfnet_road.py in class Encoder(nn.Module) from "self.road_linear_1 = nn.Linear(512 * 3 * 5, 1024)" to "self.road_linear_1 = nn.Linear(512 * 4 * 5, 1024)", the program executes normally. How can this be explained? Am I right? Thanks!
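A hedged reading of those shapes: the road-classification branch flattens a 512-channel feature map whose spatial size depends on the input resolution, and the pretrained layer expects a 3x5 grid. Checking the arithmetic in the error message:

# m2's first dimension is what the linear layer expects:
expected = 512 * 3 * 5   # 7680, matching nn.Linear(512 * 3 * 5, 1024)
# m1's second dimension is what this input actually produced:
got = 512 * 4 * 5        # 10240, matching "m1: [1 x 10240]"

So enlarging the layer makes the shapes line up, but it also changes the architecture away from what the pretrained weights were trained for; resizing the input to the resolution the scripts on this page use (640x360) should avoid editing the model.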

Confused in loading weights

# loading weights

  1. model_w = torch.load('Desktop/ld-lsi/res/weights/weights_erfnet_road.pth')
  2. new_mw = {}
  3. for k, w in model_w.items():
  4.     new_mw[k[7:]] = w
  5. roadnet.load_state_dict(new_mw)

roadnet.to(device)
roadnet.eval()

I am a little confused here:
1) In line 1 we are loading the pretrained weights of the erfnet_road model. Are these the weights of the model mentioned in the paper (giving 76% accuracy on BDD100k), or are they the pretrained weights of the encoder trained on ImageNet?
2) Lines 3 and 4 take each layer, omit the word "module" from its name, and save its corresponding weights in the new_mw dict; everything else stays the same. Then what is the difference between new_mw and model_w, since they both hold the same weights for each layer? (See the sketch below.)
3) What is the role of the pretrained network (weights_erfnet_road.pth) here? If I want to train the model from scratch, i.e. without using pretrained weights, can I get the same results as mentioned in the paper?
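On point 2, a likely explanation (the node code later on this page wraps the model in torch.nn.DataParallel, which prefixes every parameter name with "module."): the tensors in new_mw and model_w are indeed identical; only the key names differ, and stripping the 7-character "module." prefix makes them match an unwrapped model. A minimal sketch with a toy dict standing in for the checkpoint:

# len("module.") == 7, so k[7:] drops the DataParallel prefix
model_w = {'module.encoder.conv.weight': 'w'}   # toy stand-in for a tensor
new_mw = {k[7:]: w for k, w in model_w.items()}
print(new_mw)   # {'encoder.conv.weight': 'w'}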

Thank You!

request for training code

Hi,

It's really awesome work. I need the training script and code; may I request it? I am running the code without catkin, geometry_msgs, and rospy, currently using CUDA 10.1 and PyTorch 1.3.0 for inference. Soon I will contribute all my new changes; meanwhile, it would be great if you could share the training code.

Thanks,
Muhammad Ajmal Siddiqui

Input size,type and ROS version

Thank you for sharing this great code.
I want to ask about the input resolution and format, and about the ROS version.

1. As for the input size:
I read in your paper that you tested on the BDD dataset, so I publish 1280 HD jpg images to cnn_node. I tested using imshow in cnn_node and I can successfully show the image before converting it to a torch tensor:

input_tensor = torch.from_numpy(image)

but the following error occurs, so I want to know the correct input size and format (see the sketch after question 2 below).

[ERROR] [1564474751.061718]: Cannot obtain output. Exception: size mismatch, m1: [1 x 30720], m2: [7680 x 1024] at /opt/conda/conda-bld/pytorch_1535488076166/work/aten/src/THC/generic/THCTensorMathBlas.cu:249
[ERROR] [1564474751.062359]: Cannot publish message. Exception: local variable 'ego_lane_points' referenced before assignment
[ERROR] [1564474751.063066]: Visualization error. Exception: local variable 'output' referenced before assignment

2. It is compatible with the Lunar version, but I want to use Kinetic. Is it possible to use this with Kinetic, and if it is an easy change, please tell me how.
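On question 1, a hedged suggestion: the shapes in the error (30720 = 512 * 6 * 10 produced vs. 7680 = 512 * 3 * 5 expected) differ by exactly a factor of two per spatial dimension, which matches feeding 1280x720 instead of 640x360; the non-ROS test script later on this page resizes to 640x360 and does not hit this error. A minimal sketch (file path hypothetical):

import cv2

# resize the 1280x720 frame down to the 640x360 the weights appear to expect
frame = cv2.imread("frame.jpg", cv2.IMREAD_COLOR)  # hypothetical input
frame = cv2.resize(frame, (640, 360))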

Thank you very much

CNN does not find any lanes when trying to infer on own data

Hi there!
First of all, thank you very much for sharing the code and for the awesome paper you have released. Really great work!
Now, to the problem. I am trying to infer on my own data using the pretrained models, and the results are very bad. Ego_lane_points and other_lane_points are always empty, though the data is pretty similar to bdd_data. Maybe I am missing something? Should I somehow preprocess the data, or is there a need to train the net?
Also, I have converted the code to not use ROS and to use python3 instead of python2, just to test it out on a few images, so I could easily have broken something, but I only deleted all the ROS stuff and changed a few things for python3 (e.g. substituting '/' with '//').
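One candidate cause worth flagging (an observation about the code below, not a confirmed diagnosis): in Python 3, / is true division and // is floor division, so the blanket '/' → '//' substitution turns the normalization input_tensor.float() // 255 into a floor division that maps every pixel value below 255 to zero, handing the network an almost all-black image; the fps computation at the end of the callback has the same substitution. A two-line check:

import torch

x = torch.tensor([128.0])
print(x / 255)   # tensor([0.5020]) -- true division: correct normalization
print(x // 255)  # tensor([0.])     -- floor division: zeroes every pixel below 255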

This is the CNN "node" code

class LDCNNNode:
    """
        CNN Node. It takes an image as input and processes it using the neural network. Then it resizes
        the output and publishes it on a topic.
    """
    def img_received_callback(self, image, name):
        '''
            Callback for image processing
            It submits the image to the CNN, extracts the output, then resizes it for clustering
            and publishes it on a topic.
        
              Args:
                  image: image published on topic by the camera
        '''
        try:
            ### Pytorch conversion
            print("Received image")
            start_t = time.time()
            input_tensor = torch.from_numpy(image)
            input_tensor = input_tensor.float() // 255
            input_tensor = input_tensor.permute(2,0,1).unsqueeze(0)
            print(input_tensor.size())
        except Exception as e:
            print("Cannot convert image to pytorch. Exception: %s" % e)

        try:
            ### PyTorch 0.4.0 compatibility inference code
            if torch.__version__ < "0.4.0":
                input_tensor = Variable(input_tensor, volatile=True)
                output = self.cnn(input_tensor)
            else:
                with torch.no_grad():
                    input_tensor = Variable(input_tensor)
                    output = self.cnn(input_tensor)

            if self.with_road:
                output, output_road = output
                road_type = output_road.max(dim=1)[1][0]
            ### Classification
            output = output.max(dim=1)[1]
            output = output.float().unsqueeze(0)

            ### Resize to desired scale for easier clustering
            output = F.interpolate(output, size=(output.size(2) // self.resize_factor, output.size(3) // self.resize_factor) , mode='nearest')

            ### Obtaining actual output
            ego_lane_points = torch.nonzero(output.squeeze() == 1)
            other_lanes_points = torch.nonzero(output.squeeze() == 2)

            ego_lane_points = ego_lane_points.view(-1).cpu().numpy()
            other_lanes_points = other_lanes_points.view(-1).cpu().numpy()

        except Exception as e:
            print("Cannot obtain output. Exception: %s" % e)

        print("-ego: {}".format(ego_lane_points))
        self.pub.publish(ego_lane_points,other_lanes_points,-1 if not self.with_road else road_type,name)
        self.time.append(time.time() - start_t)
        print("Sent lanes information to clustering node with " \
                 + " %s ego lane points and %s other lanes points. %s fps" % (len(ego_lane_points), len(other_lanes_points), len(self.time) // sum(self.time)))
        ### Debug visualization options
        if self.debug:
            try:
                # Convert the image and substitute the colors for egolane and other lane
                output = output.squeeze().unsqueeze(2).data.cpu().numpy()
                output = output.astype(np.uint8)

                output = cv2.cvtColor(output, cv2.COLOR_GRAY2RGB)
                output[np.where((output == [1, 1, 1]).all(axis=2))] = COLORS_DEBUG[0]
                output[np.where((output == [2, 2, 2]).all(axis=2))] = COLORS_DEBUG[1]

                # Blend the original image and the output of the CNN
                output = cv2.resize(output, (image.shape[1], image.shape[0]), interpolation=cv2.INTER_NEAREST)
                image = cv2.addWeighted(image, 1, output, 0.4, 0)
                if self.with_road:
                    cv2.putText(image, ROAD_MAP[road_type], (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

                # Visualization
                print("Visualizing output")
                cv2.imwrite("./out/cnn_{}.jpg".format(name), cv2.resize(image, (320, 240), cv2.INTER_NEAREST))
                #cv2.waitKey(1)
            except Exception as e:
                print("Visualization error. Exception: %s" % e)

    def __init__(self):
        """
            Class constructor.
        """
        try:
            # Adding models path to PYTHONPATH to import modules
            print(os.path.join(MODULE_PATH,'res','models'))
            sys.path.insert(0, os.path.join(MODULE_PATH,'res','models'))

            # Initialize CNN parameters with defaults
            model_name = 'erfnet'
            weights_name = 'weights_erfnet.pth'
            self.resize_factor = 1
            self.debug = True
            self.with_road = False
            queue_size = 10
            
        except Exception as e:
            print("Cannot load parameters. Check your roscore. %s" % e)

        try:
            weights_path = os.path.join(MODULE_PATH, 'res', 'weights', weights_name)
            print(weights_path)
            # Assuming the main constructor is method Net()
            self.cnn = importlib.import_module(model_name).Net()

            # try on cpu
            device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 
            model_dict = torch.load(weights_path, map_location=lambda storage, loc: storage)
            self.cnn = torch.nn.DataParallel(self.cnn, device_ids=[0]).to(device)
            print(device)
            self.cnn.load_state_dict(model_dict)
            self.cnn.eval()

            print("Initialized CNN %s", model_name)
        except Exception as e:
            print("Cannot load neural network. Exception: %s" % e)

        self.time = []

        try:            
            # Publisher interface to send messages to the clustering node
            self.pub = ros_interface_pub(LDClusteringNode())

            # ROS node setup
        except Exception as e:
            print("Cannot initialize ros node. Exception: %s" % e)    

This is how I read the images

def start_inferring(node, path):
    for image in path.iterdir():
        print("--- Processing {} ---".format(str(image)))
        oriimg = cv2.imread(str(image),cv2.IMREAD_COLOR)
        print(oriimg.shape)

        img = cv2.resize(oriimg,(640,360))
        node.img_received_callback(img, image.stem)
        print("---------------------------\n")       

This is the link to the data sample I infer on.
