Leveraging Human Intelligence For Better AI At Alegion With Cheryl Martin - Episode 38


July 2nd, 2018

46 mins 13 secs

About this Episode


Data is often messy or incomplete, requiring human intervention to make sense of it before it can be used as input to machine learning projects. This becomes a problem once the volume grows beyond a handful of records. In this episode Dr. Cheryl Martin, Chief Data Scientist for Alegion, discusses the importance of properly labeled information for machine learning and artificial intelligence projects, the systems that Alegion has built to scale the incorporation of human intelligence into data preparation, and the challenges inherent to such an endeavor.


  • Your host is Tobias Macey and today I’m interviewing Cheryl Martin, chief data scientist at Alegion, about data labeling at scale


  • Introduction
  • How did you get involved in the area of data management?
  • To start, can you explain the problem space that Alegion is targeting and how you operate?
  • When is it necessary to include human intelligence as part of the data lifecycle for ML/AI projects?
  • What are some of the biggest challenges associated with managing human input to data sets intended for machine usage?
  • For someone who is acting as a human-intelligence provider as part of the workforce, what does their workflow look like?
    • What tools and processes do you have in place to ensure the accuracy of their inputs?
    • How do you prevent bad actors from contributing data that would compromise the trained model?

  • What are the limitations of crowd-sourced data labels?

    • When is it beneficial to incorporate domain experts in the process?

  • When doing data collection from various sources, how do you ensure that intellectual property rights are respected?

  • How do you determine the taxonomies to be used for structuring data sets that are collected, labeled or enriched for your customers?

    • What kinds of metadata do you track and how is that recorded/transmitted?

  • Do you think that human intelligence will be a necessary piece of ML/AI forever?

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?


