Tokenization into words or sub-word units is a key component of any Natural Language Processing pipeline. Modern approaches such as Byte Pair Encoding (Sennrich et al., 2015), WordPiece or SentencePiece (Kudo et al., 2018) segment rare words into sub-tokens i... (more…)
Tokio is a Rust framework for developing
applications which perform asynchronous I/O — an event-driven
approach that can often achieve better scalability, performance, and
resource usage than conventional synchronous I/O. Unfortunately, Tokio
is notorio... (more…)
My First Clippy Lint
Recently I wrote my first Clippy lint. It was much easier to
implement and test than I had expected. In … (more…)
Facebook, Amazon, and Microsoft are forming new Rust teams, and Google has also been hiring former Mozilla engineers to build with the language. (more…)
Well, at least, the NLL borrow checker finally got fully enabled by default - but that’s not as catchy of a title, is it? (more…)