dataset

CoNLL-2003

CoNLL-2003 is a widely used benchmark dataset for named entity recognition (NER) in natural language processing (NLP). It consists of English and German news articles annotated with four entity types: persons, organizations, locations, and miscellaneous names. The dataset was created for the CoNLL-2003 shared task and has become a standard benchmark for evaluating NER models.
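To make the annotation layout concrete, here is a minimal sketch of reading CoNLL-2003-style data in Python. It assumes the commonly distributed column layout: one token per line with POS, chunk, and NER tags, blank lines between sentences, and `-DOCSTART-` lines between documents. The sample text, the `read_conll` helper, and the `B-`/`I-` tag prefixes are illustrative assumptions; some copies of the data use a different tagging scheme, so check the files you actually have.

```python
from typing import List, Tuple

# Illustrative sample only. Columns are assumed to be:
# token, POS tag, chunk tag, NER tag (whitespace-separated).
SAMPLE = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O
. . O O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
"""

Sentence = List[Tuple[str, str]]  # list of (token, NER tag) pairs


def read_conll(text: str) -> List[Sentence]:
    """Split column-formatted text into sentences of (token, NER tag) pairs."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("-DOCSTART-"):
            # Document boundary marker, not a real token.
            continue
        fields = line.split()
        # First column is the token, last column is the NER tag.
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


sentences = read_conll(SAMPLE)
```

Each entry in `sentences` is one sentence, so `sentences[0][0]` is the pair for the token `EU`. Only the first and last columns are kept here because they are the ones an NER model needs.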

Also known as: CoNLL 2003, CoNLL2003, CoNLL-03, CoNLL03, CoNLL 03
Why learn CoNLL-2003?

Developers should use CoNLL-2003 when training or benchmarking NER models, because it provides a consistent, well-annotated dataset for comparing performance across algorithms. It is useful for research in information extraction and text mining, and for applications such as chatbots or search engines that need to identify entities in text.
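Results on this benchmark are usually compared with entity-level precision, recall, and F1, where a predicted entity counts only if both its boundaries and its type match the gold annotation exactly. The following is a rough sketch of that idea, not an official scorer. It assumes tags with `B-`/`I-` prefixes, and the helper names `extract_spans` and `entity_f1` and the toy tag sequences are made up for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one tag sequence (B- opens, I- continues)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        prefix, _, kind = tag.partition("-")
        continues = prefix == "I" and kind == label
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        # Leniently treat a stray I- tag with no open span as a span start.
        if prefix == "B" or (prefix == "I" and start is None):
            start, label = i, kind
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 over parallel lists of tag sequences."""
    gold_spans, pred_spans = set(), set()
    for n, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(n, *s) for s in extract_spans(g)}
        pred_spans |= {(n, *s) for s in extract_spans(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction misses one entity and truncates another.
gold = [["B-ORG", "O", "B-MISC", "O", "O"], ["B-PER", "I-PER"]]
pred = [["B-ORG", "O", "O", "O", "O"], ["B-PER", "O"]]
score = entity_f1(gold, pred)
```

In the toy example, the truncated person name counts as an error even though its first token is tagged correctly, which is why entity-level scores tend to be lower than per-token accuracy. For published comparisons, use an established scoring tool rather than a hand-written function like this one.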
