{"id":34815,"date":"2018-11-13T18:36:22","date_gmt":"2018-11-13T23:36:22","guid":{"rendered":"https:\/\/digital.hbs.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/"},"modified":"2018-11-13T18:36:22","modified_gmt":"2018-11-13T23:36:22","slug":"leveraging-machine-learning-to-reduce-spam-on-twitter","status":"publish","type":"hck-submission","link":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/","title":{"rendered":"Leveraging Machine Learning to Reduce Spam on Twitter"},"content":{"rendered":"<p>Twitter, a popular online social networking site, facilitates communication of ideas among individuals, companies, and organizations. Founded in 2006, Twitter now has over 300 million active users worldwide [1]. In a nutshell, Twitter is a social media-based communication platform on which users exchange \u201ctweets,\u201d short messages (280 characters or fewer) capturing ideas, news, and reactions. The platform processes 6,000 messages per second or nearly 200 billion per year.<\/p>\n<p>Given such a high volume of messages, content is largely unfiltered. In fact, Twitter\u2019s business model is built upon real-time communication and so any potential review system would impede on posting speed and run counter to the site\u2019s purpose. The lack of filtering has lead to a rise in spam and automatically generated content. Prior to implementing detection mechanisms, Twitter relied on users reporting incidents of suspected spam. 
Users submit thousands of spam reports daily, identifying sources of irrelevant or inappropriate content with the intention of disabling spam accounts.<\/p>\n<p>As Twitter mentions, \u201cinauthentic accounts, spam, and malicious automation disrupt everyone\u2019s experience on Twitter.\u201d Reducing spam and its negative impact on users has emerged as a top priority for Twitter, because its corporate valuation (as with similar internet-based companies) depends on the size and engagement level of its user base, and dissatisfied users could leave the site or become less active. Advances in machine learning have presented an opportunity for Twitter to leverage computer processing power and known user trends to identify and automatically disable accounts contributing spam. As such, Twitter has invested in machine learning to address this problem. Because Twitter\u2019s \u201cproduct\u201d is a robust base of news and thoughts, spam directly threatens the site\u2019s value proposition, and action against spam is necessary for its survival and success.<\/p>\n<p>In a recent memo, Twitter reaffirmed its commitment to ensuring that shared content is reliable, trustworthy, and relevant. The rampant incidence of spam motivated Twitter\u2019s decision to invest in machine learning technology to automatically identify and take action against accounts producing spam [2]. This approach fights spam proactively, rather than waiting to receive and verify suspected reports of spam submitted by users. Since implementation last year, Twitter has been successful in identifying 3x more spam accounts (roughly 10 million per week as of May) [2]. Consequently, user-submitted spam reports have declined from 25,000 per day in March to 17,000 per day in May.<\/p>\n<p>Research conducted by Alex H. Wang at Pennsylvania State University describes the mechanism behind Twitter\u2019s approach to spam detection. 
His research paper entitled \u201cDetecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach\u201d highlights the numerical features that algorithms use to identify \u201cspammy\u201d accounts. The algorithm extracts the number of a user\u2019s friends and followers, and the relationships among them, to evaluate the user\u2019s authenticity. According to Mr. Wang, this machine learning approach is \u201can efficient and accurate to identify spam bots in Twitter\u201d [3].<\/p>\n<p>Twitter has used this machine learning approach since 2017 and continues to see positive results. In the near term, Twitter expects to continue calibrating and refining its machine learning algorithms to improve spam detection. According to a research report from Deakin University in Australia, Twitter\u2019s current methods and techniques achieve an accuracy rate of approximately 80% [4]. This is already quite successful, but it leaves room for improvement. Presently, in roughly 20% of cases, the algorithms misclassify accounts, either missing spam or incorrectly labeling legitimate accounts as spam and interrupting their ability to contribute content. Over the next few years, it is reasonable to expect that Twitter will narrow this gap and bring the accuracy of its machine learning algorithms closer to 100%.<\/p>\n<p>Creating a Twitter account involves specifying a name and phone number or e-mail address, which is then verified via a code sent to the user. Sophisticated bots can generate fake phone numbers and e-mail addresses and successfully create Twitter accounts. As a suggestion, Twitter could adopt Google\u2019s reCAPTCHA technology, which requires users to decipher blurry words or phrases and enter them to proceed with creating an account. 
While the reCAPTCHA step is not insurmountable, unsophisticated bots struggle to pass it and would thus be prevented from creating spam accounts [5].<\/p>\n<p><strong>Question for further discussion:<\/strong> What is the customer impact of falsely identifying a Twitter account as spam and suspending it?<\/p>\n<p>Word Count: 702<\/p>\n<p><strong>Sources<\/strong><\/p>\n<p>[1] Twitter, Inc. Annual Report 2018 (February 2018). <a href=\"http:\/\/www.viewproxy.com\/Twitter\/2018\/AnnualReport2017.pdf\">http:\/\/www.viewproxy.com\/Twitter\/2018\/AnnualReport2017.pdf<\/a>. Accessed November 13, 2018.<\/p>\n<p>[2] Twitter, Inc. \u201cHow Twitter is Fighting Spam and Malicious Automation\u201d (June 2018). <a href=\"https:\/\/blog.twitter.com\/official\/en_us\/topics\/company\/2018\/how-twitter-is-fighting-spam-and-malicious-automation.html\">https:\/\/blog.twitter.com\/official\/en_us\/topics\/company\/2018\/how-twitter-is-fighting-spam-and-malicious-automation.html<\/a>. Accessed November 13, 2018.<\/p>\n<p>[3] Wang, Alex Hai. <a href=\"https:\/\/link.springer.com\/content\/pdf\/10.1007%2F978-3-642-13739-6_25.pdf\">Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach.<\/a> (June 2010). <em>Data Applications: Security and Privacy<\/em>. Accessed November 13, 2018.<\/p>\n<p>[4] Wu, Tingmin et al. <a href=\"https:\/\/dl.acm.org\/citation.cfm?id=3014815\">Twitter Spam Detection Based on Deep Learning.<\/a> (February 2017). <em>Australasian Computer Science Week.<\/em> Accessed November 13, 2018.<\/p>\n<p>[5] Beede, Rodney. <a href=\"https:\/\/www.rodneybeede.com\/downloads\/CSCI5722%20-%20Computer%20Vision,%20Final%20Paper,%20Rodney%20Beede,%20Fall%202010.pdf\">Analysis of reCAPTCHA Effectiveness.<\/a> (December 2010). 
<em>Computer Vision.<\/em> Accessed November 13, 2018.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Engineers at Twitter have leveraged machine learning techniques to reduce the incidence of spam on the popular social networking site.<\/p>\n","protected":false},"author":11522,"featured_media":34835,"comment_status":"open","ping_status":"closed","template":"","categories":[346,416,4979,75],"class_list":["post-34815","hck-submission","type-hck-submission","status-publish","has-post-thumbnail","hentry","category-machine-learning","category-social-media","category-spam","category-twitter","hck-taxonomy-organization-twitter","hck-taxonomy-industry-technology","hck-taxonomy-country-united-states"],"connected_submission_link":"https:\/\/d3.harvard.edu\/platform-rctom\/assignment\/rc-tom-challenge-2018\/","yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Leveraging Machine Learning to Reduce Spam on Twitter - Technology and Operations Management<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Leveraging Machine Learning to Reduce Spam on Twitter - Technology and Operations Management\" \/>\n<meta property=\"og:description\" content=\"Engineers at Twitter have leveraged machine learning techniques to reduce the incidence of spam on the popular social networking site.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/\" \/>\n<meta property=\"og:site_name\" content=\"Technology and Operations Management\" \/>\n<meta 
property=\"og:image\" content=\"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/Twitter-3.jpeg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/\",\"name\":\"Leveraging Machine Learning to Reduce Spam on Twitter - Technology and Operations 
Management\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/wp-content\\\/uploads\\\/sites\\\/4\\\/2018\\\/11\\\/Twitter-3.jpeg\",\"datePublished\":\"2018-11-13T23:36:22+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/#primaryimage\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/wp-content\\\/uploads\\\/sites\\\/4\\\/2018\\\/11\\\/Twitter-3.jpeg\",\"contentUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/wp-content\\\/uploads\\\/sites\\\/4\\\/2018\\\/11\\\/Twitter-3.jpeg\",\"width\":1200,\"height\":628},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/leveraging-machine-learning-to-reduce-spam-on-twitter\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Submissions\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/submission\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\
"Leveraging Machine Learning to Reduce Spam on Twitter\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/#website\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/\",\"name\":\"Technology and Operations Management\",\"description\":\"MBA Student Perspectives\",\"potentialAction\":[{\"@type\":\"性视界Action\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-rctom\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Leveraging Machine Learning to Reduce Spam on Twitter - Technology and Operations Management","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/","og_locale":"en_US","og_type":"article","og_title":"Leveraging Machine Learning to Reduce Spam on Twitter - Technology and Operations Management","og_description":"Engineers at Twitter have leveraged machine learning techniques to reduce the incidence of spam on the popular social networking site.","og_url":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/","og_site_name":"Technology and Operations Management","og_image":[{"width":1200,"height":628,"url":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/Twitter-3.jpeg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/","url":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/","name":"Leveraging Machine Learning to Reduce Spam on Twitter - Technology and Operations Management","isPartOf":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/#website"},"primaryImageOfPage":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/#primaryimage"},"image":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/#primaryimage"},"thumbnailUrl":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/Twitter-3.jpeg","datePublished":"2018-11-13T23:36:22+00:00","breadcrumb":{"@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/#primaryimage","url":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/Twitter-3.jpeg","contentUrl":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-content\/uploads\/sites\/4\/2018\/11\/Twitter-3.jpeg","width":1200,"height":628},{"@type":"BreadcrumbList","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/leveraging-machine-learning-to-reduce-spam-on-twitter\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/d3.harvard.edu\/platform-rctom\/"},{"@type":
"ListItem","position":2,"name":"Submissions","item":"https:\/\/d3.harvard.edu\/platform-rctom\/submission\/"},{"@type":"ListItem","position":3,"name":"Leveraging Machine Learning to Reduce Spam on Twitter"}]},{"@type":"WebSite","@id":"https:\/\/d3.harvard.edu\/platform-rctom\/#website","url":"https:\/\/d3.harvard.edu\/platform-rctom\/","name":"Technology and Operations Management","description":"MBA Student Perspectives","potentialAction":[{"@type":"性视界Action","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/d3.harvard.edu\/platform-rctom\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/hck-submission\/34815","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/hck-submission"}],"about":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/types\/hck-submission"}],"author":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/users\/11522"}],"replies":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/comments?post=34815"}],"version-history":[{"count":0,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/hck-submission\/34815\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/media\/34835"}],"wp:attachment":[{"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/media?parent=34815"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-rctom\/wp-json\/wp\/v2\/categories?post=34815"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}