
The hardest part of training a model for foreign languages is getting a correctly labeled dataset. I worked with a pretrained model on Polish-language documents, and based on that experience the results are relatively good if you use text similarity measures. There are also examples and pretrained models for Korean, English, and French.
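
To make "text similarity measures" a bit more concrete, here is a minimal sketch of cosine similarity, one common choice. The comment doesn't say which measure or model was used, so this is only an assumption: the function name and the three vectors below are made-up placeholders, standing in for embeddings that a pretrained model would produce for real documents.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical embeddings (not real model output), for illustration only.
doc_a = [0.9, 0.1, 0.3]
doc_b = [0.8, 0.2, 0.4]
doc_c = [-0.5, 0.9, 0.0]

print(cosine_similarity(doc_a, doc_b))  # higher: vectors point the same way
print(cosine_similarity(doc_a, doc_c))  # lower: vectors point apart
```

In practice you would swap the placeholder vectors for embeddings from whatever pretrained model covers your language, and compare documents pairwise.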

