---
license: mit
language:
- en
library_name: spacy
tags:
- named-entity-recognition
- b2b
- ecommerce
- order-processing
- product-extraction
- spacy
datasets:
- custom
metrics:
- f1
- precision
- recall
model-index:
- name: b2b-ecommerce-ner
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: custom
      name: B2B Ecommerce Orders
    metrics:
    - type: f1
      value: 0.82
      name: F1 Score
    - type: precision
      value: 0.82
      name: Precision
    - type: recall
      value: 0.81
      name: Recall
---
# B2B Ecommerce NER Model
## Model Description
This is a spaCy-based Named Entity Recognition (NER) model trained for B2B ecommerce order processing. It extracts structured information (products, quantities, sizes, and units) from retailer-to-manufacturer order text so that orders can be captured and processed automatically.
## Supported Entities
The model identifies the following entity types:
- **PRODUCT**: Product names and descriptions (e.g., "Coca Cola", "Golden Dates", "Chocolate Cleanser")
- **QUANTITY**: Order quantities (e.g., "5", "10", "twenty")
- **SIZE**: Product sizes and measurements (e.g., "500ML", "250G", "1.25L")
- **UNIT**: Units of measurement (e.g., "units", "bottles", "packs")
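
QUANTITY and SIZE entities come back as raw text, so downstream code usually has to normalize them. The following is a minimal sketch of one way to do that. It is not part of the model: the helper names, the number-word table, and the size pattern are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical lookup for spelled-out quantities; extend it as needed.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "twelve": 12, "twenty": 20,
}

# A number (optionally decimal) followed by a unit, e.g. "500ML" or "1.25L".
SIZE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_quantity(text: str) -> Optional[int]:
    """Convert QUANTITY text such as "5" or "twenty" to an int, or None."""
    cleaned = text.strip().lower()
    if cleaned.isdigit():
        return int(cleaned)
    return NUMBER_WORDS.get(cleaned)


def parse_size(text: str) -> Optional[Tuple[float, str]]:
    """Split SIZE text such as "1.25L" into (value, unit), or None."""
    match = SIZE_PATTERN.match(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2).upper()


print(parse_quantity("twenty"))  # 20
print(parse_size("1.25L"))       # (1.25, 'L')
```

Both helpers return `None` for text they cannot interpret, which lets the caller route such orders to manual review instead of guessing.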
## Features
- **High Accuracy**: Achieves an F1 score of 0.82 on B2B ecommerce order data
- **Product Catalog Matching**: Includes fuzzy matching of extracted products against a catalog of 1,855+ products
- **Mixed-language Input**: Handles mixed English/Hindi text common in Indian B2B commerce (experimental; results are best on English text)
- **Real-world Patterns**: Trained on actual retailer order patterns and variations
## Usage
### Basic Usage
```python
from huggingface_model.model import B2BEcommerceNER
# Load the model
model = B2BEcommerceNER.from_pretrained("path/to/model")
# Extract entities from order text
results = model.predict(["Order 5 bottles of Coca Cola 650ML"])
print(results[0])
# Output: {
#   'text': 'Order 5 bottles of Coca Cola 650ML',
#   'entities': {
#     'products': [{'text': 'Coca Cola', 'label': 'PRODUCT', 'start': 19, 'end': 28}],
#     'quantities': [{'text': '5', 'label': 'QUANTITY', 'start': 6, 'end': 7}],
#     'sizes': [{'text': '650ML', 'label': 'SIZE', 'start': 29, 'end': 34}],
#     'units': [{'text': 'bottles', 'label': 'UNIT', 'start': 8, 'end': 15}],
#     'catalog_matches': [...]
#   }
# }
```
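
The result dictionary shown above groups entities by type. A small helper can flatten it into a single order line. This is a sketch only: `to_order_line` and `first_text` are hypothetical names, and the code assumes at most one product per order text. Orders that mention several products would need logic to pair each product with its own quantity, unit, and size.

```python
from typing import Any, Dict, List, Optional


def first_text(items: List[Dict[str, Any]]) -> Optional[str]:
    """Return the text of the first entity in a list, or None if it is empty."""
    return items[0]["text"] if items else None


def to_order_line(result: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Flatten one prediction result into a single order line."""
    entities = result.get("entities", {})
    return {
        "product": first_text(entities.get("products", [])),
        "quantity": first_text(entities.get("quantities", [])),
        "unit": first_text(entities.get("units", [])),
        "size": first_text(entities.get("sizes", [])),
    }


# Sample result copied from the documented output (catalog matches omitted).
sample = {
    "text": "Order 5 bottles of Coca Cola 650ML",
    "entities": {
        "products": [{"text": "Coca Cola", "label": "PRODUCT", "start": 19, "end": 28}],
        "quantities": [{"text": "5", "label": "QUANTITY", "start": 6, "end": 7}],
        "sizes": [{"text": "650ML", "label": "SIZE", "start": 29, "end": 34}],
        "units": [{"text": "bottles", "label": "UNIT", "start": 8, "end": 15}],
    },
}

print(to_order_line(sample))
```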
### Pipeline Usage
```python
from huggingface_model.model import pipeline
# Create NER pipeline
ner_pipeline = pipeline("ner", model="b2b-ecommerce-ner")
# Process text
entities = ner_pipeline("I need 10 packs of biscuits")
```
### Batch Processing
```python
# Process multiple orders at once
orders = [
    "Order 5 Coke Zero 650ML",
    "Send 12 bottles of mango juice",
    "I need 3 units of Chocolate Cleanser 500ML",
]
results = model.predict(orders)
```
## Training Data
The model was trained on a dataset of 500 B2B ecommerce orders containing:
- Real retailer-to-manufacturer communications
- Mixed English/Hindi text patterns
- Various product categories (beverages, food items, personal care)
- Different order formats and structures
- 1,002 labeled entities across 4 entity types
## Model Performance
| Metric | Score |
|--------|-------|
| F1 Score | 0.82 |
| Precision | 0.82 |
| Recall | 0.81 |
These scores are aggregated across all four entity types. Per-entity scores are not listed here; PRODUCT and QUANTITY were the strongest categories.
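
F1 is the harmonic mean of precision and recall, so the three reported numbers can be checked against each other. The sketch below recomputes F1 from the rounded precision and recall values. The function name is illustrative, and the averaging scheme behind the reported scores (micro or macro) is not stated in this card, so a small difference from the table is expected.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_from_precision_recall(0.82, 0.81)
print(f"{f1:.3f}")  # 0.815
```

The rounded inputs give roughly 0.815, which agrees with the reported 0.82 to within the rounding of the two inputs.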
## Product Catalog Integration
The model includes a fuzzy matching system that can match extracted products against a catalog of 1,855+ products, providing:
- **Brand Matching**: Match to specific brands (e.g., "Coca Cola", "Ziofit")
- **Product Variants**: Find different sizes/variants of the same product
- **Confidence Scores**: Numerical confidence for each match (0-100)
- **SKU Mapping**: Direct mapping to product SKUs for order processing
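
The matching idea can be sketched as follows. This is not the model's implementation: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the fuzzy-matching library, and the catalog rows, SKU values, function names, and the `min_score` threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Union

# Hypothetical catalog rows; a real catalog would be loaded from a file.
CATALOG = [
    {"sku": "SKU-001", "name": "Coca Cola 650ML"},
    {"sku": "SKU-002", "name": "Coke Zero 650ML"},
    {"sku": "SKU-003", "name": "Golden Dates 250G"},
]


def similarity(a: str, b: str) -> int:
    """Case-insensitive string similarity on a 0-100 scale."""
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)


def match_product(
    query: str,
    catalog: List[Dict[str, str]],
    min_score: int = 60,
) -> Optional[Dict[str, Union[str, int]]]:
    """Return the best-scoring catalog row, or None if it scores below min_score."""
    if not catalog:
        return None
    best = max(catalog, key=lambda row: similarity(query, row["name"]))
    score = similarity(query, best["name"])
    if score < min_score:
        return None
    return {"sku": best["sku"], "name": best["name"], "score": score}


print(match_product("Coca Cola", CATALOG))
```

Returning `None` below the threshold keeps weak matches from being mapped to a SKU silently. The right threshold depends on the catalog and should be tuned on real orders.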
## Limitations
- Performance may vary on product names not seen during training
- Best results with English text; mixed language support is experimental
- Requires product catalog file for fuzzy matching features
- Built on the spaCy framework rather than a transformer architecture
## Technical Details
- **Framework**: spaCy 3.8+
- **Base Model**: en_core_web_sm
- **Training**: Custom NER component with 50 iterations
- **Entity Labels**: 4 custom entity types
- **Input**: Raw text strings
- **Output**: Structured entity information with optional catalog matching
## Installation
```bash
pip install spacy pandas fuzzywuzzy python-levenshtein
python -m spacy download en_core_web_sm
```
## Citation
```bibtex
@misc{b2b_ecommerce_ner_2025,
  title={B2B Ecommerce NER Model for Order Processing},
  author={Your Name},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/your-username/b2b-ecommerce-ner}
}
```
## License
This model is released under the MIT License.