  {"id":7532,"date":"2018-03-26T16:53:09","date_gmt":"2018-03-26T20:53:09","guid":{"rendered":"https:\/\/digital.hbs.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/"},"modified":"2018-03-26T16:54:16","modified_gmt":"2018-03-26T20:54:16","slug":"crowdflower-powering-the-human-side-of-artificial-intelligence","status":"publish","type":"hck-submission","link":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/","title":{"rendered":"CrowdFlower: Powering the Human Side of Artificial Intelligence"},"content":{"rendered":"<p>Machine learning has a dirty secret. Despite all the buzzwords thrown around and the wonders accomplished with this technology, there\u2019s a very human element at the root of every application. The secret is that to train a model, a real flesh-and-blood person first has to tag, label, describe, or otherwise interact with massive amounts of data.<\/p>\n<p>With good reason, producing this data is considered unengaging and unrewarding work. In HBO\u2019s show Silicon Valley, where the grunt work of the tech industry is a running gag, Erlich, one of the show\u2019s main characters, attempts to trick a Stanford introductory computer science class into labeling millions of images of food with which to train his \u201cSee Food\u201d application. The application couldn\u2019t be built without the data, yet the effort required to build a meaningful dataset was beneath his character\u2019s station as a titan of the technology industry. It\u2019s a commentary on a problem unique to machine learning. 
The data required to train the model is invaluable and extremely difficult to get, and yet producing it is considered lowly work.<\/p>\n<p style=\"text-align: center\"><iframe loading=\"lazy\" title=\"Silicon Valley: Professor Big Head (Season 4 Episode 4 Clip) | HBO\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/T0FA_69nXjM?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<p>A company that\u2019s been solving this problem is CrowdFlower. Founded in 2007<a href=\"#_ftn1\" name=\"_ftnref1\">[1]<\/a>, the company adds the human element to machine learning by crowdsourcing the construction and cleaning of massive datasets. It leverages a large online workforce to answer questions that are easy for a human but impossible for a state-of-the-art machine learning model.<\/p>\n<h3><strong>CrowdFlower solves a complex problem in a scalable way.<\/strong><\/h3>\n<ol>\n<li>CrowdFlower continually audits its contributor base with \u201ctest\u201d questions to ensure that contributors are constantly providing accurate responses<a href=\"#_ftn1\" name=\"_ftnref1\">[2]<\/a>. 
Traditionally, quality control of this kind relies on cross-checking answers among multiple contributors to ensure a consistent and precise answer to each question.<br \/>\n<a href=\"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/bus.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-7530 aligncenter\" src=\"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/bus-239x300.png\" alt=\"\" width=\"239\" height=\"300\" srcset=\"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/bus-239x300.png 239w, https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/bus-479x600.png 479w, https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/bus.png 718w\" sizes=\"auto, (max-width: 239px) 100vw, 239px\" \/><\/a><br \/>\nIf a contributor answers too many test questions incorrectly, they are removed from the system and can no longer participate in further tagging operations. Additionally, their previous results are thrown out.<\/li>\n<li>CrowdFlower enables companies to scale their operations up or down to match demand<a href=\"#_ftn1\" name=\"_ftnref1\">[3]<\/a>. Because the company runs so many projects with so many contributors, a customer gets fast turnaround when it needs to build a dataset but is not left with a significant amount of idle capacity once its tagging operations are complete. Organizing such large groups in-house would be untenable for all but the largest data science organizations.<\/li>\n<li>CrowdFlower supports continuous model training. 
Through its API, even after a model is deployed, developers can leverage the CrowdFlower platform in multiple innovative ways, such as passing hard-to-categorize cases to a real human in near real time or auditing new data to ensure that the model continues to perform correctly.<\/li>\n<\/ol>\n<p>CrowdFlower charges for access to its platform of human contributors who categorize the data. A customer on the platform sets a price for each unit of data they want to collect, and CrowdFlower then takes a commission of roughly 20%<a href=\"#_ftn1\" name=\"_ftnref1\">[4]<\/a>. The more complex each unit of data, the higher the price needed to entice a member of the community to take the job. Additionally, the higher the price set for each unit of data, the quicker the project will get done, because more people will be willing to take on the task. As in most things, customers are paying for a combination of speed, complexity, and volume. The more of each of those factors, the more a customer is going to pay.<\/p>\n<p>The company has had success thus far but faces an interesting dilemma. The better the models they help train become, and the more machine learning models can accomplish on their own, the less need there is for their services. 
While the number of machine learning applications is likely growing in the short term, and the demand for building datasets will likely continue to grow with it, eventually they will have solved away the very problem their service addresses.<\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"#_ftnref1\" name=\"_ftn1\">[1]<\/a> https:\/\/www.crunchbase.com\/organization\/crowdflower<\/p>\n<p><a href=\"#_ftnref1\" name=\"_ftn1\">[2]<\/a> https:\/\/success.crowdflower.com\/hc\/en-us\/articles\/208465816-Quality<\/p>\n<p><a href=\"#_ftnref1\" name=\"_ftn1\">[3]<\/a> https:\/\/success.crowdflower.com\/hc\/en-us\/articles\/202703355-Using-CrowdFlower-s-Internal-Channel-Option<\/p>\n<p><a href=\"#_ftnref1\" name=\"_ftn1\">[4]<\/a> https:\/\/success.crowdflower.com\/hc\/en-us\/articles\/202703165-Job-Costs-FAQ<\/p>\n","protected":false},"excerpt":{"rendered":"<p>CrowdFlower is solving machine learning&#039;s dirty secret &#8211; that humans are still needed to train the models we all rely on every day.<\/p>\n","protected":false},"author":2342,"featured_media":7541,"comment_status":"open","ping_status":"closed","template":"","categories":[29,2186,673,366],"class_list":["post-7532","hck-submission","type-hck-submission","status-publish","has-post-thumbnail","hentry","category-big-data","category-crowdflower","category-crowdsourcing","category-machine-learning","hck-taxonomy-organization-crowdflower","hck-taxonomy-industry-information-technology","hck-taxonomy-country-united-states"],"connected_submission_link":"https:\/\/d3.harvard.edu\/platform-digit\/assignment\/competing-with-or-against-crowds\/","yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>CrowdFlower: Powering the Human Side of Artificial Intelligence - Digital Innovation and Transformation<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"CrowdFlower: Powering the Human Side of Artificial Intelligence - Digital Innovation and Transformation\" \/>\n<meta property=\"og:description\" content=\"CrowdFlower is solving machine learning&#039;s dirty secret - that humans are still needed to train the models we all rely on every day.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/\" \/>\n<meta property=\"og:site_name\" content=\"Digital Innovation and Transformation\" \/>\n<meta property=\"article:modified_time\" content=\"2018-03-26T20:54:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/cf_logotype_signature.png\" \/>\n\t<meta property=\"og:image:width\" content=\"493\" \/>\n\t<meta property=\"og:image:height\" content=\"76\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/\",\"name\":\"CrowdFlower: Powering the Human Side of Artificial Intelligence - Digital Innovation and Transformation\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2018\\\/03\\\/cf_logotype_signature.png\",\"datePublished\":\"2018-03-26T20:53:09+00:00\",\"dateModified\":\"2018-03-26T20:54:16+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/#primaryimage\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/wp-content\\\/uploads\\\/sites\\
\/2\\\/2018\\\/03\\\/cf_logotype_signature.png\",\"contentUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2018\\\/03\\\/cf_logotype_signature.png\",\"width\":493,\"height\":76},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/crowdflower-powering-the-human-side-of-artificial-intelligence\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Submissions\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"CrowdFlower: Powering the Human Side of Artificial Intelligence\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/#website\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/\",\"name\":\"Digital Innovation and Transformation\",\"description\":\"MBA Student Perspectives\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"CrowdFlower: Powering the Human Side of Artificial Intelligence - Digital Innovation and Transformation","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/","og_locale":"en_US","og_type":"article","og_title":"CrowdFlower: Powering the Human Side of Artificial Intelligence - Digital Innovation and Transformation","og_description":"CrowdFlower is solving machine learning&#039;s dirty secret - that humans are still needed to train the models we all rely on every day.","og_url":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/","og_site_name":"Digital Innovation and Transformation","article_modified_time":"2018-03-26T20:54:16+00:00","og_image":[{"width":493,"height":76,"url":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/cf_logotype_signature.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/","url":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/","name":"CrowdFlower: Powering the Human Side of Artificial Intelligence - Digital Innovation and Transformation","isPartOf":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/#website"},"primaryImageOfPage":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/#primaryimage"},"image":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/#primaryimage"},"thumbnailUrl":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/cf_logotype_signature.png","datePublished":"2018-03-26T20:53:09+00:00","dateModified":"2018-03-26T20:54:16+00:00","breadcrumb":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificial-intelligence\/#primaryimage","url":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/cf_logotype_signature.png","contentUrl":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2018\/03\/cf_logotype_signature.png","width":493,"height":76},{"@type":"BreadcrumbList","@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/crowdflower-powering-the-human-side-of-artificia
l-intelligence\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/d3.harvard.edu\/platform-digit\/"},{"@type":"ListItem","position":2,"name":"Submissions","item":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/"},{"@type":"ListItem","position":3,"name":"CrowdFlower: Powering the Human Side of Artificial Intelligence"}]},{"@type":"WebSite","@id":"https:\/\/d3.harvard.edu\/platform-digit\/#website","url":"https:\/\/d3.harvard.edu\/platform-digit\/","name":"Digital Innovation and Transformation","description":"MBA Student Perspectives","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/d3.harvard.edu\/platform-digit\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission\/7532","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission"}],"about":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/types\/hck-submission"}],"author":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/users\/2342"}],"replies":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/comments?post=7532"}],"version-history":[{"count":0,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission\/7532\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/media\/7541"}],"wp:attachment":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/media?parent=7532"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/categories?post=7532"}],"curies":[{"name":"wp","href":"https:\/\/ap
i.w.org\/{rel}","templated":true}]}}