	---
	license: mit
	library_name: pytorch
	tags:
	- faster-rcnn
	- object-detection
	- computer-vision
	- pytorch
	- bdd100k
	- autonomous-driving
	- BDD 100K
	- from-scratch
	pipeline_tag: object-detection
	datasets:
	- bdd100k
	widget:
	- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
	  example_title: "Sample Image"
	model-index:
	- name: faster-rcnn-bdd-vanilla
	  results:
	  - task:
	      type: object-detection
	    dataset:
	      type: bdd100k
	      name: Berkeley DeepDrive (BDD) 100K
	    metrics:
	    - type: mean_average_precision
	      name: mAP
	      value: "TBD"
	---

	# Faster R-CNN - Berkeley DeepDrive (BDD) 100K Vanilla

	A Faster R-CNN model trained from scratch on the Berkeley DeepDrive (BDD) 100K dataset for object detection in autonomous driving scenarios.

	## Model Details

	- Model Type: Faster R-CNN Object Detection
	- Dataset: Berkeley DeepDrive (BDD) 100K
	- Training Method: trained from scratch
	- Framework: PyTorch
	- Task: Object Detection

	## Dataset Information

	This model was trained on the Berkeley DeepDrive (BDD) 100K dataset, which contains the following object classes:

	car, truck, bus, motorcycle, bicycle, person, traffic light, traffic sign, train, rider
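
When decoding predictions, the integer labels have to be mapped back to these class names. The sketch below assumes the torchvision convention that index 0 is reserved for background and that the classes are numbered in the order listed above. The ordering actually used by this checkpoint is not documented here, so `BDD_CLASSES`, `ID_TO_NAME`, and `label_to_name` are hypothetical names that should be checked against the training configuration.

```python
# Hypothetical label map: assumes index 0 is background and that classes
# follow the order listed in this card. Verify against the training config.
BDD_CLASSES = [
    "car", "truck", "bus", "motorcycle", "bicycle",
    "person", "traffic light", "traffic sign", "train", "rider",
]

# Foreground classes start at 1; 0 is reserved for background.
ID_TO_NAME = {i + 1: name for i, name in enumerate(BDD_CLASSES)}
ID_TO_NAME[0] = "__background__"


def label_to_name(label_id: int) -> str:
    """Return the class name for an integer label, or 'unknown' if unmapped."""
    return ID_TO_NAME.get(int(label_id), "unknown")
```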

	### Dataset-specific Details:

	Berkeley DeepDrive (BDD) 100K Dataset:
	- 100,000 driving images covering diverse weather and lighting conditions
	- Designed for autonomous driving applications
	- Contains urban driving scenarios from multiple cities
	- Annotations include bounding boxes for vehicles, pedestrians, and traffic elements

	## Usage

	This model can be used with PyTorch and common object detection frameworks:

	```python
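	import torch
	import torchvision.transforms as transforms
	from PIL import Image

	# Load the model (example using torchvision).
	# Assumes the checkpoint stores the full pickled model object; newer PyTorch
	# releases need weights_only=False for that. Only load files you trust.
	model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False)
	model.eval()

	# Prepare your image: convert to RGB, then to a float tensor in [0, 1]
	transform = transforms.Compose([
	    transforms.ToTensor(),
	])

	image = Image.open('path/to/image.jpg').convert('RGB')
	image_tensor = transform(image).unsqueeze(0)  # add a batch dimension

	# Run inference without tracking gradients
	with torch.no_grad():
	    predictions = model(image_tensor)

	# Process results: one dict per input image
	boxes = predictions[0]['boxes']    # (N, 4) boxes as [x1, y1, x2, y2]
	scores = predictions[0]['scores']  # (N,) confidence scores
	labels = predictions[0]['labels']  # (N,) integer class indices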
	```
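
Raw detector output usually contains many low-confidence boxes, so a common next step is to keep only detections above a score threshold. The helper below is a minimal sketch of that step in plain Python. The name `filter_detections` and the default threshold of 0.5 are illustrative assumptions, not values taken from this model's training or evaluation setup.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # (x1, y1, x2, y2) in pixels


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    labels: Sequence[int],
    score_threshold: float = 0.5,  # arbitrary default, tune per use case
) -> List[Tuple[Box, float, int]]:
    """Keep detections whose confidence is at least score_threshold."""
    kept = [
        (box, score, label)
        for box, score, label in zip(boxes, scores, labels)
        if score >= score_threshold
    ]
    # Return the highest-confidence detections first.
    kept.sort(key=lambda det: det[1], reverse=True)
    return kept
```

With the tensors from the snippet above, the inputs could be obtained via `boxes.tolist()`, `scores.tolist()`, and `labels.tolist()`.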

	## Model Performance

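	Evaluation results have not been reported yet; the mAP entry in the metadata is still listed as TBD. The model was trained from scratch on the Berkeley DeepDrive (BDD) 100K dataset using the Faster R-CNN architecture.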
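
The `mAP` entry in the metadata refers to mean average precision. As a rough illustration of how one per-class average precision value is obtained once detections have been matched to ground truth, the sketch below computes the area under the precision-recall curve with all-point interpolation. The function name `average_precision` and its input format are assumptions made for illustration. This is not the evaluation script used for this model, and benchmark protocols may differ in details such as the IoU thresholds they average over.

```python
from typing import List, Sequence, Tuple


def average_precision(
    detections: Sequence[Tuple[float, bool]],
    num_ground_truth: int,
) -> float:
    """Area under the precision-recall curve for a single class.

    detections: (score, is_true_positive) pairs, already matched to
        ground-truth boxes at some IoU threshold.
    num_ground_truth: number of ground-truth boxes for this class.
    """
    if num_ground_truth <= 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_pos = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_pos += 1
        precisions.append(true_pos / rank)
        recalls.append(true_pos / num_ground_truth)

    # Interpolation: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas over each increase in recall.
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Mean average precision is then the mean of these per-class values over all classes.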

	## Architecture

	Faster R-CNN (Region-based Convolutional Neural Network) is a two-stage object detection framework:

	1. Region Proposal Network (RPN): Generates object proposals
	2. Fast R-CNN detector: Classifies proposals and refines bounding box coordinates
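
Both stages commonly rely on intersection over union (IoU) and non-maximum suppression (NMS): the RPN prunes overlapping proposals, and the detector prunes duplicate final boxes. The pure-Python sketch below illustrates greedy NMS. It is not the implementation used by this model (frameworks such as torchvision ship optimized versions), and the function names and the 0.5 threshold are illustrative assumptions.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # illustrative value
) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box
        # by more than the threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```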

	Commonly cited advantages of the architecture:
	- High-accuracy object detection
	- Precise localization through bounding box refinement in the second stage
	- Good performance on small objects
	- Well-established architecture with extensive research backing

	## Intended Use

	- Primary Use: Object detection in autonomous driving scenarios
	- Suitable for: Research, development, and deployment of object detection systems
	- Limitations: Performance may vary on images significantly different from the training distribution

	## Citation

	If you use this model, please cite:

	```bibtex
	@article{ren2015faster,
	  title={Faster {R-CNN}: Towards real-time object detection with region proposal networks},
	  author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
	  journal={Advances in neural information processing systems},
	  volume={28},
	  year={2015}
	}
	```

	## License

	This model is released under the MIT License.

	## Keywords

	Faster R-CNN, Object Detection, Computer Vision, BDD 100K, Autonomous Driving, Deep Learning, Two-Stage Detection