numind
/

NuNER_Zero

 ---
 license: mit
+datasets:
+- numind/NuNER
+library_name: gliner
+language:
+- en
+pipeline_tag: token-classification
+tags:
+- entity recognition
+- NER
+- named entity recognition
+- zero shot
+- zero-shot
 ---
+NuZero is a family of zero-shot entity recognition models inspired by [GLiNER](https://huggingface.co/papers/2311.08526) and built with insights we gathered throughout our work on [NuNER](https://huggingface.co/collections/numind/nuner-token-classification-and-ner-backbones-65e1f6e14639e2a465af823b).
+The key difference between NuZero Token and GLiNER is the ability to **detect entities that are longer than 12 tokens**, as NuZero Token operates at the token level rather than at the span level. NuZero Token also performs about 1% better on average.
+<p align="center">
+<img src="zero_shot_performance_unzero_token.png">
+</p>
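To illustrate why working at the token level removes the length cap, here is a toy sketch. It is not the model's actual decoding code. It assumes a hypothetical list of per-token labels in which `"O"` marks tokens outside any entity, and it groups consecutive tokens that share a label into spans. The helper name `spans_from_token_labels` is made up for this example.

```python
from typing import List, Tuple


def spans_from_token_labels(token_labels: List[str]) -> List[Tuple[int, int, str]]:
    """Group consecutive tokens with the same non-"O" label.

    Returns (start_token, end_token_exclusive, label) tuples.
    Hypothetical helper for illustration only.
    """
    spans: List[Tuple[int, int, str]] = []
    start = 0
    current = "O"
    # A trailing "O" sentinel closes any span still open at the end.
    for i, label in enumerate(token_labels + ["O"]):
        if label != current:
            if current != "O":
                spans.append((start, i, current))
            start, current = i, label
    return spans


# A 15-token entity is recovered in one piece: nothing in the grouping
# depends on a maximum span width.
toy_labels = ["O"] + ["organization"] * 15 + ["O", "person", "person"]
toy_spans = spans_from_token_labels(toy_labels)
print(toy_spans)
```

One limitation of this simplified grouping: two distinct entities with the same label that sit directly next to each other would be fused into a single span.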
+## Installation & Usage
+```
+!pip install gliner
+```
+**NuZero requires labels to be lower-cased**
+```python
+from gliner import GLiNER
+model = GLiNER.from_pretrained("numind/NuZero_span")
+# NuZero requires labels to be lower-cased!
+labels = ["person", "award", "date", "competitions", "teams"]
+labels = [l.lower() for l in labels]
+text = """
+"""
+entities = model.predict_entities(text, labels)
+for entity in entities:
+    print(entity["text"], "=>", entity["label"])
+```
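Because predictions are made at the token level, a long entity may come back as several adjacent pieces with the same label. The sketch below shows one possible way to fuse such pieces. It assumes each entity dict carries `start` and `end` character offsets (end exclusive) in addition to the `text` and `label` keys used above. `merge_adjacent_entities` is a hypothetical helper, not part of the `gliner` package, and the sample entities are hand-written rather than real model output.

```python
from typing import Dict, List


def merge_adjacent_entities(entities: List[Dict], text: str) -> List[Dict]:
    """Fuse same-label entities that touch or are separated by one character.

    Hypothetical helper; assumes "start"/"end" are character offsets into `text`.
    """
    merged: List[Dict] = []
    for entity in sorted(entities, key=lambda e: e["start"]):
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and entity["label"] == previous["label"]
            and entity["start"] - previous["end"] <= 1
        ):
            # Extend the previous entity and re-slice its text from the source.
            previous["end"] = max(previous["end"], entity["end"])
            previous["text"] = text[previous["start"]:previous["end"]]
        else:
            merged.append(dict(entity))  # copy so the input list is not mutated
    return merged


sample_text = "Cristiano Ronaldo dos Santos Aveiro won the Ballon d'Or"
sample_entities = [  # hand-written stand-ins for model output
    {"start": 0, "end": 17, "text": "Cristiano Ronaldo", "label": "person"},
    {"start": 18, "end": 35, "text": "dos Santos Aveiro", "label": "person"},
    {"start": 44, "end": 55, "text": "Ballon d'Or", "label": "award"},
]
merged_entities = merge_adjacent_entities(sample_entities, sample_text)
for entity in merged_entities:
    print(entity["text"], "=>", entity["label"])
```

The one-character gap tolerance is a design choice that treats a single space between pieces as a continuation; a stricter or looser rule may suit other texts better.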
+## Fine-tuning
+## Citation
+### This work
+```bibtex
+@misc{bogdanov2024nuner,
+      title={NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data},
+      author={Sergei Bogdanov and Alexandre Constantin and Timothée Bernard and Benoit Crabbé and Etienne Bernard},
+      year={2024},
+      eprint={2402.15343},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+### Previous work
+```bibtex
+@misc{zaratiana2023gliner,
+      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
+      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
+      year={2023},
+      eprint={2311.08526},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```