German NER on Legal Data using BERT
This project consists of the following tasks:
To run this project on localhost, follow these simple steps:
conda create -n german_bert_ner python=3.9
conda activate german_bert_ner
git clone https://github.com/harshildarji/German-NER-BERT.git
cd to repo:
cd German-NER-BERT
pip3 install -r requirements.txt
The app needs three files: model.pt, tag_values.pkl, and tokenizer.pkl. You can either generate them by running the german_bert_ner.ipynb notebook, which takes 45-60 minutes, or download the latest versions from my Dropbox using:
wget https://www.dropbox.com/s/vos8pqwmlbqe0wf/model.pt
wget https://www.dropbox.com/s/u2oojgmmprt0a9d/tag_values.pkl
wget https://www.dropbox.com/s/uj15pab78emefoq/tokenizer.pkl
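Before starting the app, it can help to confirm that all three files actually landed on disk. The following is a minimal sketch, not part of the repository: the helper name missing_artifacts is hypothetical, and it assumes the files sit in the repository root next to app.py. Treating an empty file as missing is only a cheap heuristic for a failed download.

```python
from pathlib import Path

# File names as listed in this README.
REQUIRED_FILES = ("model.pt", "tag_values.pkl", "tokenizer.pkl")


def missing_artifacts(directory="."):
    """Return the required files that are absent or empty in `directory`.

    Hypothetical helper for illustration; assumes the files are expected
    in the given directory (by default the current one).
    """
    base = Path(directory)
    missing = []
    for name in REQUIRED_FILES:
        path = base / name
        # A zero-byte file usually means the download did not complete.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All required files are present.")
```

If anything is reported as missing, rerun the corresponding wget command or regenerate the files through the notebook.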
Run app.py as:
python3 app.py
Once app.py is successfully executed, head over to http://localhost:5000/.
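If the page does not load in the browser, a quick scripted check can tell whether the server is answering at all. This is a minimal sketch using only the standard library; the helper name is_app_up is hypothetical, and it only assumes the app answers a plain GET on its root URL, as the step above implies. It bypasses any configured proxy because the target is local.

```python
import urllib.request


def is_app_up(url="http://localhost:5000/", timeout=3.0):
    """Return True if a GET request to `url` succeeds with HTTP 200.

    Hypothetical helper for illustration; the default URL is the one
    given in this README.
    """
    # An empty ProxyHandler disables proxies, which should not apply to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    print("App reachable:", is_app_up())
```

A result of False usually means app.py is not running, has not finished starting up, or is listening on a different port.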
In the provided text-area, input a German (law) sentence, for example: 1. Das Bundesarbeitsgericht ist gemäß § 9 Abs. 2 Satz 2 ArbGG iVm. § 201 Abs. 1 Satz 2 GVG für die beabsichtigte Klage gegen den Bund zuständig .