Python is commonly seen as the AI/ML language, but it is often a dull blade: its dynamic typing is unsafe and it is slow, like really slow. Many popular natural language processing toolkits only have Python APIs, and we want to see that change. We use Go for the majority of our data processing tasks because it lets us write simple, fast code. Today we are open-sourcing a tool that has made our ML lives easier in Go. Say hello to go-conllu.
What is CoNLL-U?
The Conference on Natural Language Learning (CoNLL) has created multiple file formats for storing natural language annotations. CoNLL-U is one such format and is used by the Universal Dependencies project, which hosts many annotated text corpora. To use these corpora, we need a parser that makes the data simple for developers to work with.
How Does Go-Conllu Help?
go-conllu parses CoNLL-U data. It is a simple and reliable way to import CoNLL-U data into your application as Go structs.
Let's take a look at the example quick-start code from the Readme. First, download the package.
go get github.com/nuvi/go-conllu
Then in a new project:
package main

import (
	"fmt"
	"log"

	conllu "github.com/nuvi/go-conllu"
)

func main() {
	sentences, err := conllu.ParseFile("path/to/model.conllu")
	if err != nil {
		log.Fatal(err)
	}

	for _, sentence := range sentences {
		for _, token := range sentence.Tokens {
			fmt.Println(token)
		}
		fmt.Println()
	}
}
All the sentences and tokens in the corpus will be printed to the console. If you need a .conllu corpus file, you can download the Universal Dependencies English training model.