
Generate CARLA Datasets

To effectively run models in a full-stack environment in Carla and/or to evaluate Carla scenarios offline, we'll need to create a dataset of sensor and truth data from Carla. This guide walks through how to do so and exactly which configuration knobs you can change to alter the composition of the dataset.

Preparation

Scenario Configuration

We'll want to create a custom scenario configuration file. This configuration file will most importantly define the density of other objects in the scene, the parameters for the truth recorder, and the sensor configurations.

Consideration: Sensor Setup

Let's say we wish to capture camera and LiDAR data from an ego. We want the cameras to be spaced out around the ego vehicle so we can get diverse viewing angles of the scene. For every sensor, we'll need to define a set of attributes including:

- sensor_tick: the interval between sensor captures
- fov: field of view of the sensor
- location: ego-relative sensor placement
- rotation: ego-relative sensor rotation

Additionally:

- image_size_x, image_size_y for cameras
- channels, rotation_frequency for LiDAR

Importantly, the LiDAR's rotation_frequency should match the client rate parameter in the same YAML file; otherwise, each simulation tick won't line up with a complete 360 degree sweep.
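
For reference, here is a minimal sketch of how these attributes map onto the raw Carla Python API when a sensor gets spawned (the sandbox builds all of this from the YAML for you; the host/port and the ego_vehicle handle are assumptions for illustration):

import carla

client = carla.Client("localhost", 2000)  # assumed host/port
world = client.get_world()
bp_lib = world.get_blueprint_library()

# camera blueprint with the attributes discussed above
cam_bp = bp_lib.find("sensor.camera.rgb")
cam_bp.set_attribute("sensor_tick", "0.10")
cam_bp.set_attribute("fov", "90")
cam_bp.set_attribute("image_size_x", "1600")
cam_bp.set_attribute("image_size_y", "900")

# ego-relative placement and orientation
tf = carla.Transform(
    carla.Location(x=1.6, y=0.0, z=1.6),
    carla.Rotation(pitch=0.0, yaw=0.0, roll=0.0),
)

# attach to an already-spawned ego vehicle (a carla.Vehicle handle)
# camera = world.spawn_actor(cam_bp, tf, attach_to=ego_vehicle)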

Consideration: Occlusion

One of the MAJOR shortcomings of native Carla, in my opinion, is that, given an ego vehicle and an NPC character with a bounding box, there is no way to know whether that NPC is visible to the ego vehicle. To put it more concretely, there is no way to tell if, e.g., there is a building blocking the ego's view of the NPC. If I'm mistaken here, PLEASE someone let me know, as this would greatly simplify a few things...

Anyway, this is actually a very important fact for the generation of a Carla dataset. Specifically, when training and evaluating our perception algorithms, we need to know which objects are visible to the ego and which are partially or (more importantly) completely occluded by some element of the scene. To do so, we can take one of two approaches:

- Option 1: include a LiDAR sensor with 360 degree coverage. In post-processing, use the number of LiDAR points inside each bounding box to determine the level of occlusion (i.e., objects behind buildings will have no points in them). This could work, except that it is difficult, nigh impossible, to determine the level of occlusion (e.g., unoccluded, partial, most, etc.) from point counts alone.
- Option 2: include a camera depth sensor at the exact same position as a regular RGB camera. Use the depth image to determine what fraction of the depth values inside the object's 2D bounding box projection appear to be at the right depth according to the 3D bounding box. The benefit of this approach is a much more granular picture of each object's occlusion. The downside is that, instead of just a centrally mounted LiDAR, we need enough cameras to cover the entire 360 degree field of view. (Has anyone tried a 360 degree field of view camera?)

We take the second approach, placing a depth camera at the position of every RGB camera; a rough sketch of the depth-based check follows below.
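
To make option 2 concrete, here is a minimal sketch of a depth-image-based occlusion check. It assumes the depth image has already been decoded to meters, the 3D box has been projected to a 2D box, and a nominal range to the object is known; the tolerance and thresholds are illustrative and not the values used by the actual postprocessing:

import numpy as np

def carla_depth_to_meters(bgra):
    """Decode a Carla depth image (BGRA, 8 bits per channel) into meters."""
    b = bgra[..., 0].astype(np.float64)
    g = bgra[..., 1].astype(np.float64)
    r = bgra[..., 2].astype(np.float64)
    normalized = (r + g * 256.0 + b * 256.0 ** 2) / (256.0 ** 3 - 1)
    return 1000.0 * normalized  # Carla's depth camera encodes ranges up to 1000 m

def occlusion_from_depth(depth_m, box2d, obj_range_m, tol_m=2.0):
    """Estimate an object's occlusion from the depth values inside its 2D box.

    depth_m     : (H, W) depth image in meters, aligned with the RGB camera
    box2d       : (xmin, ymin, xmax, ymax) projection of the object's 3D box
    obj_range_m : distance from the camera to the object's 3D box
    tol_m       : how close a pixel's depth must be to count as "on the object"
    """
    xmin, ymin, xmax, ymax = (int(round(v)) for v in box2d)
    patch = depth_m[max(ymin, 0):ymax, max(xmin, 0):xmax]
    if patch.size == 0:
        return "complete"
    visible_frac = float(np.mean(np.abs(patch - obj_range_m) < tol_m))
    if visible_frac > 0.50:  # illustrative thresholds only
        return "none-or-partial"
    elif visible_frac > 0.05:
        return "most"
    else:
        return "complete"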

Consideration: Simulation Speed

We want this data collection to execute as fast as possible. To reduce the burden on the CPUs and GPUs, we'll disable the pygame display. You may still be able to observe a few things in the Carla docker, but good luck finding the ego vehicle!

Putting This Together

Altogether, we may end up with a configuration file that looks something like:

---
client:
  rate: 20

display:
  enabled: false

world:
  n_random_vehicles: 300
  n_random_walkers: 0

recorder:
  record_truth: true
  format_as: ['avstack']

infrastructure:
  # use defaults...

ego:
  idx_spawn: 'randint'
  idx_vehicle: 'lincoln'
  idx_destination: null
  roaming: false
  autopilot: true
  respawn_on_done: true
  max_speed: 30
  sensors:
    - camera 0:
        name: 'CAM_FRONT'
        attributes:
          sensor_tick: 0.10
          fov: 90
          image_size_x: 1600
          image_size_y: 900
        save: true
        noise: {}
        transform:
          location:
            x: 1.6
            y: 0
            z: 1.6
          rotation:
            pitch: 0
            yaw: 0
            roll: 0
    - depthcam 0:
        name: 'CAM_FRONT_DEPTH'
        attributes:
          sensor_tick: 0.10
          fov: 90
          image_size_x: 1600
          image_size_y: 900
        save: true
        noise: {}
        transform:
          location:
            x: 1.6
            y: 0
            z: 1.6
          rotation:
            pitch: 0
            yaw: 0
            roll: 0
    - lidar 0:
        name: 'LIDAR_TOP'
        save: true
        attributes:
          sensor_tick: 0.10
          channels: 32
          rotation_frequency: 20  # needs to be the same as sim rate
          range: 100.0
        noise: {}
        transform:
          location:
            x: -0.5
            y: 0
            z: 1.8
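
As a quick sanity check on a configuration like this, the snippet below (illustrative only) loads the YAML and verifies the constraint noted earlier, namely that every LiDAR's rotation_frequency matches the client rate:

import yaml

with open("scenarios/data_capture.yml") as f:
    cfg = yaml.safe_load(f)

rate = cfg["client"]["rate"]
for sensor in cfg["ego"]["sensors"]:
    # each list entry is a one-key mapping, e.g. {"lidar 0": {...}}
    (name, spec), = sensor.items()
    if name.startswith("lidar"):
        assert spec["attributes"]["rotation_frequency"] == rate, name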

EgoVehicleStack Configuration

This is the easy part. For generating a dataset, we just need an autopilot vehicle! We'll want the ego vehicle to explore the town thoroughly and follow most traffic rules. As a result, we can simply invoke the PassthroughAutopilotVehicle.
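
Conceptually, a passthrough autopilot ego stack does nothing but hand control to Carla's traffic manager and let the recorder capture the raw sensor streams. The sketch below only illustrates that idea; it is not the actual PassthroughAutopilotVehicle implementation from the sandbox:

class PassthroughAutopilotSketch:
    """Do-nothing ego stack: Carla's autopilot drives while sensor data is recorded."""

    def __init__(self, ego_actor, tm_port=8000):
        self.ego_actor = ego_actor                    # a carla.Vehicle
        self.ego_actor.set_autopilot(True, tm_port)   # hand control to the traffic manager

    def tick(self, frame, sensor_data):
        # no perception, tracking, prediction, or planning; data passes through untouched
        return sensor_data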

Running

We'll create a run script that looks like this:

#!/usr/bin/env bash

VERSION=${1:-0.9.13}
N_SCENARIOS=${2:-20}
MAX_SCENARIO_LEN=${3:-20}

python exec_standard.py \
    --n_scenarios $N_SCENARIOS \
    --max_scenario_len $MAX_SCENARIO_LEN \
    --config_avstack 'PassthroughAutopilotVehicle' \
    --config_carla 'scenarios/data_capture.yml' \
    --seed 1 \
    --version $VERSION

You'll notice that there are a couple more parameters, including N_SCENARIOS and MAX_SCENARIO_LEN. To encourage diversity of scenes and prevent traffic stops from dominating the retained data, we include a maximum scenario length. Once a scenario hits this number of "simulation-world-seconds", the simulation restarts entirely, spawning the ego and NPCs at new locations. Similarly, we can set a maximum number of scenarios to capture so that, if you step away from your machine, you don't exhaust the hard drive.
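
As a rough back-of-the-envelope for the defaults above, combined with the scenario configuration from earlier (rate of 20 Hz, sensor_tick of 0.10 s):

# illustrative arithmetic only
rate = 20                # client.rate (Hz), i.e., a 0.05 s simulation step
max_scenario_len = 20    # simulation-world-seconds per scenario
n_scenarios = 20
sensor_tick = 0.10       # seconds between captures for each camera / LiDAR

ticks_per_scenario = rate * max_scenario_len              # 400 simulation ticks
frames_per_sensor = int(max_scenario_len / sensor_tick)   # ~200 captures per sensor per scenario
total_per_sensor = frames_per_sensor * n_scenarios        # ~4000 captures per sensor overall
print(ticks_per_scenario, frames_per_sensor, total_per_sensor)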

Postprocessing

After we run the data capture, our results will be saved in a folder called sim-results, with a subfolder named run_YYYY_MM_DD_HH:MM:SS (filled in with the start time of the data capture) for each scenario run. More runs will yield more timestamped subfolders.

To prepare this as a tried-and-true dataset that respects occlusions and has labels associated with sensor data, we'll need a postprocessing script. Within the lib-avstack-api repository, we've included a file called postprocess_carla_objects.py. Running this script and passing in the location of the sim-results folder will initiate postprocessing on all of the scenes.

For instance, if you are at the location carla-sandbox/submodules/lib-avstack-api and you just ran run_capture_data_random.sh from the carla-sandbox/examples folder, then to perform postprocessing on the newly-generated Carla data, you can run (from a poetry shell or by prepending poetry run, of course)

python postprocess_carla_objects.py ../../examples/sim-results

This postprocessing will do a few things. Specifically, it will:

- Put NPC coordinates into an ego-relative coordinate frame
- For each sensor, create a truth file for NPCs
- To populate each truth file, filter objects to those within the field of view of each sensor (a simplified check is sketched below)
- On the filtered objects, run either lidar-based or depth-image-based occlusion finding and filter objects by occlusion level
- Save the results to a folder called objects_sensor
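
For a flavor of the field-of-view filtering step, here is a deliberately simplified check on an object's center point in the camera frame. The real postprocessing operates on full bounding boxes; the axis convention and range cutoff here are assumptions for illustration:

import numpy as np

def in_camera_fov(center_cam, hfov_deg=90.0, max_range=150.0):
    """Rough check that an object center lies within a camera's horizontal FOV.

    center_cam : (x, y, z) in the camera frame, x to the right and z forward
    hfov_deg   : horizontal field of view (90 in the configuration above)
    max_range  : drop objects farther than this (illustrative cutoff)
    """
    x, _, z = center_cam
    if z <= 0.0 or np.hypot(x, z) > max_range:
        return False
    return abs(np.degrees(np.arctan2(x, z))) <= hfov_deg / 2.0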

Once you've postprocessed, move your sim-results folder to a safe location and call it something else - "my-amazing-carla-dataset" will do... Now, you can manage that dataset just like the KITTI or nuScenes datasets with avapi. The CarlaScenesManager will work like a charm!
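
For example, something along these lines should get you going (the import path and constructor argument are a best guess; check the avapi documentation for the exact signature):

from avapi.carla import CarlaScenesManager  # import path assumed

# point the scenes manager at the renamed dataset folder (argument name assumed)
CSM = CarlaScenesManager(data_dir="/path/to/my-amazing-carla-dataset")

# from here, scenes can be browsed much like with the KITTI / nuScenes managers in avapi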