  {"id":30854,"date":"2018-11-13T13:23:53","date_gmt":"2018-11-13T18:23:53","guid":{"rendered":"https:\/\/digital.hbs.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/"},"modified":"2018-11-13T13:23:53","modified_gmt":"2018-11-13T18:23:53","slug":"sorting-and-securing-at-scale-machine-learning-at-dropbox","status":"publish","type":"hck-submission","link":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/","title":{"rendered":"Sorting and securing at scale: machine learning at Dropbox"},"content":{"rendered":"<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">As the steward of hundreds of billions of documents belonging to over 500 million users<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[1],<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> Dropbox is starting to rely on machine learning more heavily than ever. Dropbox researchers have invested years of study into modern work and how people spend their time doing that work. 
Having found that people waste a significant amount of productive time on three specific activities \u2014 organization, contextualization, and prioritization<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[2]<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> \u2014 Dropbox views machine learning as a critical tool to help users avoid productivity potholes and maintain their focus.<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">This objective is difficult to achieve because Dropbox needs to sift through and make sense of vast amounts of content while providing an experience tailored to each and every user. \u00a0For example, while most web search engines take into account a user\u2019s search habits<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lparen\">(e.g.<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> Google search history), Dropbox must go a step further to distinguish which documents should be available to each user <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[1]<\/span>.<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">Further complicating this task is the constantly changing nature of these documents. 
As a hub for creative collaboration<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[3],<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> documents in Dropbox are continuously being updated by multiple collaborators<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[4].<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> This dynamic process means that when Dropbox indexes a certain file with certain search criteria, within a few seconds those indexes may be rendered irrelevant due to edits by various users and new indexes must be assigned.\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">Given these conditions, machine learning is going to become ever more important for Dropbox.\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div>\n<h2><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">How machine learning helps address these issues:<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><br \/>\n<\/span><\/h2>\n<\/div>\n<div><span class=\" 
author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">Dropbox\u2019s first notable feature that harnessed machine learning was their document scanner.<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-ldquo\">\u201cMore<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> than 20 billion image and PDF files have been stored in Dropbox, and of those, 10\u201320% are photos of documents\u201d<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[5].<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> Images of documents pose an issue because the text within them cannot be searched. As far as the computer is concerned this<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-ldquo\">\u201ctext\u201d<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> is just a group of pixels, not text<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[6].<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> Machine learning offered a solution. 
Dropbox built an in-house Optical Character Recognition tool<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[7]<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> that leveraged machine learning to recognize, extract, and index the text in these images so that users can search for it.<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">In the short term, Dropbox is continuing to find ways to utilize machine learning in the features they build. For instance, they recently released a redesigned search engine called<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-ldquo\">\u201cNautilus\u201d<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[7]<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> which uses machine learning to solve the problems of search described previously. Farther down the line, Dropbox appears committed to investing in machine learning expertise and to making it a foundational component of the company. 
Dropbox\u2019s job page clearly illustrates this emphasis, with open listings for machine learning engineers, product managers, PhDs, as well as college interns<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[8].<\/span><\/div>\n<div>\n<p><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/span><\/p>\n<figure id=\"attachment_30864\" aria-describedby=\"caption-attachment-30864\" style=\"width: 640px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-large wp-image-30864\" src=\"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1-1024x599.png\" alt=\"\" width=\"640\" height=\"374\" srcset=\"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1-1024x599.png 1024w, https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1-300x175.png 300w, https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1-768x449.png 768w, https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1-600x351.png 600w, https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/search-image-1.png 1300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><figcaption id=\"caption-attachment-30864\" class=\"wp-caption-text\">Searching for text in an image. 
Source: https:\/\/blogs.dropbox.com\/dropbox\/2018\/10\/search-images-text-ocr\/<\/figcaption><\/figure>\n<\/div>\n<div>\n<h2><\/h2>\n<h2><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">Recommendations:<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><br \/>\n<\/span><\/h2>\n<\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">A natural extension of their current use of machine learning is in securing user data. As the guardian of immense amounts of private user data, Dropbox is an attractive target for hackers. Machine learning, with its ability to rapidly process information at scale, could be expected to more quickly recognize and block suspicious entities attempting to access accounts and, accordingly, should be pursued in the immediate and short term. 
According to <\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><i>The Times of Israel<\/i><\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">, Dropbox is hoping to shift the focus of their Tel Aviv team to security and machine learning, and also potentially acquire a startup to further this endeavor<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[9].<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> \u00a0In addition, I would advise them to investigate the possibilities of machine learning to help enhance individual document and data security processes, including, for example, in connection with password and verification best practices.<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">Machine learning relies on large datasets and constant learning opportunities in order to evolve and improve. While Dropbox\u2019s user base offers great scale, Dropbox\u2019s approach to product releases may delay its ability to provide the machine learning protocols with lots of learning opportunities. 
While some companies in the technology industry have been known for moving fast and iterating once a product is live<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lparen\">(for<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> example, Facebook noted this strategy in their IPO filings<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[10]),<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> Dropbox is known for holding every product to a very high bar, only releasing it when the majority of the kinks have been worked out.<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lparen\">(This<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> mindset is demonstrated in their values such as<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-ldquo\">\u201cSweat<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> the details\u201d and<\/span> <span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-ldquo\">\u201cBe<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> worthy of trust\u201d<\/span> <span class=\" 
author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lbracket\">[11]).<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> This approach of only releasing products when they\u2019re highly evolved is in tension with machine learning\u2019s need to be exposed to lots of use before it can evolve. One way I recommend bridging this gap would be to look into using simulations of historic user activity in order to test and iterate with new features that rely on machine learning.\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">However, changing the company\u2019s philosophy on product launches is a major strategic decision and far from a sure success. While this careful approach to product releases has served Dropbox well so far, will the benefits of machine learning push them to release products earlier in the development process? 
Moreover, with machine learning becoming a focus for many technology companies, how can Dropbox expect to outcompete the competition for machine learning talent?\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z s-lparen\">\u00a0<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z h-lparen\">(800<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"> words)<\/span><\/div>\n<div><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/div>\n<div><span class=\"ace-all-bold-hthree\"><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><b>Sources:<\/b><\/span><\/span><\/div>\n<ol class=\"listtype-number listindent1 list-number1\" start=\"1\">\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/blogs.dropbox.com\/tech\/2018\/09\/architecture-of-nautilus-the-new-dropbox-search-engine\/\" target=\"_blank\" rel=\"noreferrer nofollow 
noopener\">https:\/\/blogs.dropbox.com\/tech\/2018\/09\/architecture-of-nautilus-the-new-dropbox-search-engine\/<\/a><\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/blogs.dropbox.com\/tech\/2018\/09\/machine-intelligence-at-dropbox-an-update-from-our-dbxi-team\/\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/blogs.dropbox.com\/tech\/2018\/09\/machine-intelligence-at-dropbox-an-update-from-our-dbxi-team\/<\/a><\/span><\/li>\n<li><span class=\"attrlink url author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><a class=\"attrlink\" href=\"http:\/\/www.dropbox.com\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">www.dropbox.com\u00a0<\/a><\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/www.cbronline.com\/news\/dropbox-search-engine\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/www.cbronline.com\/news\/dropbox-search-engine<\/a><\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/blogs.dropbox.com\/dropbox\/2018\/10\/search-images-text-ocr\/\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/blogs.dropbox.com\/dropbox\/2018\/10\/search-images-text-ocr\/<\/a><\/span><span class=\" 
author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/blogs.dropbox.com\/tech\/2017\/04\/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning\/\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/blogs.dropbox.com\/tech\/2017\/04\/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning\/<\/a><\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/venturebeat.com\/2018\/10\/09\/dropboxs-autoocr-can-index-text-from-pdfs-and-images\/\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/venturebeat.com\/2018\/10\/09\/dropboxs-autoocr-can-index-text-from-pdfs-and-images\/<\/a><\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><span class=\"attrlink url author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\"><a class=\"attrlink\" href=\"http:\/\/www.dropbox.com\/jobs\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">www.dropbox.com\/jobs<\/a><\/span><span class=\" 
author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/www.timesofisrael.com\/dropbox-seeks-to-expand-operations-in-israel-possibly-acquire-startups\/\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/www.timesofisrael.com\/dropbox-seeks-to-expand-operations-in-israel-possibly-acquire-startups\/<\/a><\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/www.sec.gov\/Archives\/edgar\/data\/1326801\/000119312512034517\/d287954ds1.htm#toc287954_10\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/www.sec.gov\/Archives\/edgar\/data\/1326801\/000119312512034517\/d287954ds1.htm#toc287954_10<\/a><\/span><\/li>\n<li><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z url\"><a class=\"dynamiclink\" href=\"https:\/\/www.sec.gov\/Archives\/edgar\/data\/1467623\/000119312518055809\/d451946ds1.htm\" target=\"_blank\" rel=\"noreferrer nofollow noopener\">https:\/\/www.sec.gov\/Archives\/edgar\/data\/1467623\/000119312518055809\/d451946ds1.htm<\/a><\/span><span class=\" author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlamz84zz86zkz65zz80zz85zz74z8z77zz71z5z90zz87zz82zqrz72z8lo1z73ze42z72z\">\u00a0<\/span><\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>How Dropbox leverages machine learning in its quest to help users maintain 
focus.<\/p>\n","protected":false},"author":11156,"featured_media":30873,"comment_status":"open","ping_status":"closed","template":"","categories":[60,346,4571,588,548,212],"class_list":["post-30854","hck-submission","type-hck-submission","status-publish","has-post-thumbnail","hentry","category-data","category-machine-learning","category-ocr","category-scale","category-software-as-a-service","category-tech","hck-taxonomy-organization-dropbox","hck-taxonomy-industry-technology","hck-taxonomy-country-united-states"],"connected_submission_link":"https:\/\/d3.harvard.edu\/platform-rctom\/assignment\/rc-tom-challenge-2018\/","yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Sorting and securing at scale: machine learning at Dropbox - Technology and Operations Management<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Sorting and securing at scale: machine learning at Dropbox - Technology and Operations Management\" \/>\n<meta property=\"og:description\" content=\"How Dropbox leverages machine learning in its quest to help users maintain focus.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/\" \/>\n<meta property=\"og:site_name\" content=\"Technology and Operations Management\" \/>\n<meta property=\"og:image\" content=\"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/nautilus-2x-wide2x-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1300\" \/>\n\t<meta 
property=\"og:image:height\" content=\"542\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/\",\"name\":\"Sorting and securing at scale: machine learning at Dropbox - Technology and Operations Management\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/wp-content\\\/uploads\\\/sites\\\/4\\\/2018\\\/11\\\/nautilus-2x-wide2x-1.png\",\"datePublished\":\"2018-11-13T18:23:53+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machi
ne-learning-at-dropbox\\\/#primaryimage\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/wp-content\\\/uploads\\\/sites\\\/4\\\/2018\\\/11\\\/nautilus-2x-wide2x-1.png\",\"contentUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/wp-content\\\/uploads\\\/sites\\\/4\\\/2018\\\/11\\\/nautilus-2x-wide2x-1.png\",\"width\":1300,\"height\":542},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/sorting-and-securing-at-scale-machine-learning-at-dropbox\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Submissions\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Sorting and securing at scale: machine learning at Dropbox\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/#website\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/\",\"name\":\"Technology and Operations Management\",\"description\":\"MBA Student Perspectives\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Sorting and securing at scale: machine learning at Dropbox - Technology and Operations Management","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/","og_locale":"en_US","og_type":"article","og_title":"Sorting and securing at scale: machine learning at Dropbox - Technology and Operations Management","og_description":"How Dropbox leverages machine learning in its quest to help users maintain focus.","og_url":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/","og_site_name":"Technology and Operations Management","og_image":[{"width":1300,"height":542,"url":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/nautilus-2x-wide2x-1.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/","url":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/","name":"Sorting and securing at scale: machine learning at Dropbox - Technology and Operations Management","isPartOf":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/#website"},"primaryImageOfPage":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/#primaryimage"},"image":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/#primaryimage"},"thumbnailUrl":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/nautilus-2x-wide2x-1.png","datePublished":"2018-11-13T18:23:53+00:00","breadcrumb":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/#primaryimage","url":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/nautilus-2x-wide2x-1.png","contentUrl":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/nautilus-2x-wide2x-1.png","width":1300,"height":542},{"@type":"BreadcrumbList","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/sorting-and-securing-at-scale-machine-learning-at-dropbox\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"H
ome","item":"https:\/\/d3.harvard.edu\/platform-rctom\/"},{"@type":"ListItem","position":2,"name":"Submissions","item":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/"},{"@type":"ListItem","position":3,"name":"Sorting and securing at scale: machine learning at Dropbox"}]},{"@type":"WebSite","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/#website","url":"https:\/\/d3.harvard.edu\/platform-rctom\/","name":"Technology and Operations Management","description":"MBA Student Perspectives","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/d3.harvard.edu\/platform-rctom\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/hck-submission\/30854","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/hck-submission"}],"about":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/types\/hck-submission"}],"author":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/users\/11156"}],"replies":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/comments?post=30854"}],"version-history":[{"count":0,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/hck-submission\/30854\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/media\/30873"}],"wp:attachment":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/media?parent=30854"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/categories?post=30854"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}