Crowdsourcing provides a scalable and efficient way to construct labeled datasets for training machine learning systems. However, creating comprehensive label guidelines for crowdworkers is often prohibitive even for seemingly simple concepts. Incomplete ... (more…)
In the context of science, the well-known adage "a picture is worth a
thousand words" might well be "a model is worth a thousand datasets."
Scientific models, such as Newtonian physics or biological gene regulatory
networks, are human-driven simplificatio... (more…)
In this post, I’m going to walk you through the process of setting up a machine learning pipeline within RavenDB. The first thing to ask, of course, is what am... (more…)
One of the main challenges of today's Machine Learning initiatives is the need for a centralized store of high-quality data that can be reused by Data Scientists across different models. Amundsen is a data discovery tool that collects metadata from your d... (more…)
Enterprise Search for complex technical disciplines remains a major challenge. The tools that work so well in web search - social clues, inbound links, and s... (more…)