---
license: mit
language:
- en
library_name: spacy
tags:
- named-entity-recognition
- b2b
- ecommerce
- order-processing
- product-extraction
- spacy
datasets:
- custom
metrics:
- f1
- precision
- recall
model-index:
- name: b2b-ecommerce-ner
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: custom
      name: B2B Ecommerce Orders
    metrics:
    - type: f1
      value: 0.82
      name: F1 Score
    - type: precision
      value: 0.82
      name: Precision
    - type: recall
      value: 0.81
      name: Recall
---

# B2B Ecommerce NER Model

## Model Description

This is a Named Entity Recognition (NER) model trained for B2B ecommerce order processing. It extracts structured information (products, quantities, sizes, and units) from retailer-to-manufacturer order text so that orders can be captured and processed automatically.

## Supported Entities

The model identifies the following entity types:

- **PRODUCT**: Product names and descriptions (e.g., "Coca Cola", "Golden Dates", "Chocolate Cleanser")
- **QUANTITY**: Order quantities (e.g., "5", "10", "twenty")
- **SIZE**: Product sizes and measurements (e.g., "500ML", "250G", "1.25L")
- **UNIT**: Units of measurement (e.g., "units", "bottles", "packs")

## Features

- **Accuracy**: Achieves an F1 score of 0.82 on B2B ecommerce order data
- **Product Catalog Matching**: Includes fuzzy matching against a product catalog of 1,855+ products
- **Mixed-language Input**: Accepts the mixed English/Hindi text common in Indian B2B commerce (experimental; results are best on English text)
- **Real-world Patterns**: Trained on actual retailer order patterns and variations

## Usage

### Basic Usage

```python
from huggingface_model.model import B2BEcommerceNER

# Load the model
model = B2BEcommerceNER.from_pretrained("path/to/model")

# Extract entities from order text
results = model.predict(["Order 5 bottles of Coca Cola 650ML"])

print(results[0])
# Output: {
#   'text': 'Order 5 bottles of Coca Cola 650ML',
#   'entities': {
#     'products': [{'text': 'Coca Cola', 'label': 'PRODUCT', 'start': 19, 'end': 28}],
#     'quantities': [{'text': '5', 'label': 'QUANTITY', 'start': 6, 'end': 7}],
#     'sizes': [{'text': '650ML', 'label': 'SIZE', 'start': 29, 'end': 34}],
#     'units': [{'text': 'bottles', 'label': 'UNIT', 'start': 8, 'end': 15}],
#     'catalog_matches': [...]
#   }
# }
```
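
The result dictionary shown above can be flattened into order lines for a downstream system. The following is a minimal sketch and not part of the model's API: `nearest` and `to_order_lines` are hypothetical helpers, and pairing each product with the closest quantity, size, and unit by character offset is an assumed heuristic that may need tuning for multi-product orders.

```python
from __future__ import annotations


def nearest(span: dict, candidates: list[dict]) -> dict | None:
    """Return the candidate whose start offset is closest to the span's start."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: abs(c["start"] - span["start"]))


def to_order_lines(result: dict) -> list[dict]:
    """Build one order line per product (hypothetical helper, assumed heuristic)."""
    entities = result["entities"]
    lines = []
    for product in entities.get("products", []):
        qty = nearest(product, entities.get("quantities", []))
        size = nearest(product, entities.get("sizes", []))
        unit = nearest(product, entities.get("units", []))
        lines.append({
            "product": product["text"],
            "quantity": qty["text"] if qty else None,
            "size": size["text"] if size else None,
            "unit": unit["text"] if unit else None,
        })
    return lines


# Same structure as the example output above (catalog matches omitted).
result = {
    "text": "Order 5 bottles of Coca Cola 650ML",
    "entities": {
        "products": [{"text": "Coca Cola", "label": "PRODUCT", "start": 19, "end": 28}],
        "quantities": [{"text": "5", "label": "QUANTITY", "start": 6, "end": 7}],
        "sizes": [{"text": "650ML", "label": "SIZE", "start": 29, "end": 34}],
        "units": [{"text": "bottles", "label": "UNIT", "start": 8, "end": 15}],
    },
}

print(to_order_lines(result))
# [{'product': 'Coca Cola', 'quantity': '5', 'size': '650ML', 'unit': 'bottles'}]
```

Missing entity types come back as `None`, so an order such as "Send Coca Cola" still yields a line that can be flagged for manual review.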

### Pipeline Usage

```python
from huggingface_model.model import pipeline

# Create NER pipeline
ner_pipeline = pipeline("ner", model="b2b-ecommerce-ner")

# Process text
entities = ner_pipeline("I need 10 packs of biscuits")
```

### Batch Processing

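```python
from huggingface_model.model import B2BEcommerceNER

# Load the model once and reuse it for every batch
model = B2BEcommerceNER.from_pretrained("path/to/model")

# Process multiple orders at once
orders = [
    "Order 5 Coke Zero 650ML",
    "Send 12 bottles of mango juice",
    "I need 3 units of Chocolate Cleanser 500ML"
]

# predict() returns one result dictionary per input string
results = model.predict(orders)
```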

## Training Data

The model was trained on a dataset of 500 B2B ecommerce orders containing:
- Real retailer-to-manufacturer communications
- Mixed English/Hindi text patterns
- Various product categories (beverages, food items, personal care)
- Different order formats and structures
- 1,002 labeled entities across 4 entity types
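
The annotation format of the training set is not documented here. As an illustration only, assuming spaCy-style character-offset annotations of the form `(text, {"entities": [(start, end, label)]})`, a labeled order and a basic sanity check might look like the sketch below. `check_annotations` is a hypothetical helper, not part of this repository.

```python
ALLOWED_LABELS = {"PRODUCT", "QUANTITY", "SIZE", "UNIT"}

# One labeled order in the assumed spaCy-style offset format.
TRAIN_EXAMPLE = (
    "Order 5 bottles of Coca Cola 650ML",
    {"entities": [
        (6, 7, "QUANTITY"),
        (8, 15, "UNIT"),
        (19, 28, "PRODUCT"),
        (29, 34, "SIZE"),
    ]},
)


def check_annotations(text, annotations):
    """Return a list of problems found in the spans; empty means the example looks fine."""
    problems = []
    prev_end = 0
    for start, end, label in sorted(annotations["entities"]):
        if label not in ALLOWED_LABELS:
            problems.append(f"unknown label {label!r}")
        if not (0 <= start < end <= len(text)):
            problems.append(f"span ({start}, {end}) is out of range")
            continue
        span_text = text[start:end]
        if span_text != span_text.strip():
            problems.append(f"span {span_text!r} has leading or trailing whitespace")
        if start < prev_end:
            problems.append(f"span ({start}, {end}) overlaps the previous span")
        prev_end = max(prev_end, end)
    return problems


print(check_annotations(*TRAIN_EXAMPLE))
# []
```

Checks like these catch off-by-one offsets before training, which matters on a dataset of this size where each mislabeled span is a noticeable share of the data.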

## Model Performance

| Metric | Score |
|--------|-------|
| F1 Score | 0.82 |
| Precision | 0.82 |
| Recall | 0.81 |

These are aggregate scores over all four entity types. PRODUCT and QUANTITY are the strongest entity types; per-entity scores are not listed here.
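
For readers who want to reproduce this kind of evaluation on their own labeled orders, the sketch below computes entity-level precision, recall, and F1 with exact-match scoring, where a predicted span counts only if its start, end, and label all match a gold span. This is a common convention for NER; whether the scores in the table were computed exactly this way is an assumption. `prf` is a hypothetical helper.

```python
def prf(gold: set, predicted: set):
    """Exact-match precision, recall, and F1 over (start, end, label) spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold spans for "Order 5 bottles of Coca Cola 650ML".
gold = {(6, 7, "QUANTITY"), (8, 15, "UNIT"), (19, 28, "PRODUCT"), (29, 34, "SIZE")}

# Illustrative prediction in which the SIZE span has the wrong boundaries.
predicted = {(6, 7, "QUANTITY"), (8, 15, "UNIT"), (19, 28, "PRODUCT"), (24, 34, "SIZE")}

precision, recall, f1 = prf(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
# P=0.75 R=0.75 F1=0.75
```

Under exact matching, a span with a correct label but a wrong boundary counts as both a false positive and a false negative, which is why one bad SIZE span lowers precision and recall together.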

## Product Catalog Integration

The model includes a fuzzy matching system that can match extracted products against a catalog of 1,855+ products, providing:

- **Brand Matching**: Match to specific brands (e.g., "Coca Cola", "Ziofit")
- **Product Variants**: Find different sizes/variants of the same product
- **Confidence Scores**: Numerical confidence for each match (0-100)
- **SKU Mapping**: Direct mapping to product SKUs for order processing
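
The idea behind the catalog lookup can be sketched with the standard library alone. The example below is an approximation, not the model's actual matcher: it uses `difflib.SequenceMatcher` as a stand-in for the fuzzy-matching library listed under Installation, and the catalog rows, SKU values, `score`, `match_product`, and the `min_score` threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical catalog rows; the real catalog's columns are not documented here.
CATALOG = [
    {"sku": "SKU-001", "name": "Coca Cola 650ML"},
    {"sku": "SKU-002", "name": "Coke Zero 650ML"},
    {"sku": "SKU-003", "name": "Golden Dates 250G"},
]


def score(query: str, candidate: str) -> int:
    """Case-insensitive similarity on a 0-100 scale."""
    ratio = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
    return round(ratio * 100)


def match_product(query: str, catalog=CATALOG, min_score: int = 60):
    """Return catalog rows scoring at least min_score, best match first."""
    scored = [
        {"sku": row["sku"], "name": row["name"], "score": score(query, row["name"])}
        for row in catalog
    ]
    hits = [m for m in scored if m["score"] >= min_score]
    return sorted(hits, key=lambda m: m["score"], reverse=True)


best = match_product("coca cola 650ml")[0]
print(best["sku"], best["score"])
# SKU-001 100
```

Returning every hit above the threshold, instead of only the top one, leaves room for a caller to show alternative sizes or variants when the best score is not decisive.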

## Limitations

- Performance may vary on product names not seen during training
- Best results with English text; mixed language support is experimental
- Requires product catalog file for fuzzy matching features
- Based on the spaCy framework, not on a transformer architecture

## Technical Details

- **Framework**: spaCy 3.8+
- **Base Model**: en_core_web_sm
- **Training**: Custom NER component trained for 50 iterations
- **Entity Labels**: 4 custom entity types
- **Input**: Raw text strings
- **Output**: Structured entity information with optional catalog matching

## Installation

```bash
pip install spacy pandas fuzzywuzzy python-levenshtein
python -m spacy download en_core_web_sm
```

## Citation

```bibtex
@misc{b2b_ecommerce_ner_2025,
  title={B2B Ecommerce NER Model for Order Processing},
  author={Your Name},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/your-username/b2b-ecommerce-ner}
}
```

## License

This model is released under the MIT License.