
BioBERT for TensorFlow1

BERT for biomedical text-mining.
NVIDIA Deep Learning Examples
Latest Version: April 4, 2023
Compressed Size: 33.71 KB

This resource uses open-source code maintained on GitHub (see the quick-start-guide section) and is available for download from NGC.

In the original paper, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, pre-training is done on Wikipedia and BooksCorpus, with state-of-the-art results demonstrated on the SQuAD (Stanford Question Answering Dataset) benchmark.

Meanwhile, many works, including BioBERT, SciBERT, NCBI-BERT, ClinicalBERT (MIT), ClinicalBERT (NYU, Princeton), and others presented at the BioNLP'19 workshop, show that additional pre-training of BERT on a large biomedical text corpus such as PubMed results in better performance on biomedical text-mining tasks.

This repository provides scripts and a recipe to adapt the NVIDIA BERT code base to achieve state-of-the-art results on the following biomedical text-mining benchmark tasks: