  {"id":15450,"date":"2022-04-15T10:29:31","date_gmt":"2022-04-15T14:29:31","guid":{"rendered":"https:\/\/digital.hbs.edu\/platform-digit\/?post_type=hck-submission&#038;p=15450"},"modified":"2022-04-22T19:44:16","modified_gmt":"2022-04-22T23:44:16","slug":"hugging-face-embracing-natural-language-processing","status":"publish","type":"hck-submission","link":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/","title":{"rendered":"Hugging Face: Embracing Natural Language Processing"},"content":{"rendered":"\n\n\n<p>Don\u2019t be fooled by the friendly emoji in the company\u2019s actual name \u2014 HuggingFace means business. What started out in 2016 as a humble chatbot company <a href=\"https:\/\/techcrunch.com\/2018\/05\/23\/hugging-face-raises-4-million-for-its-artificial-bff\/\">with investors like Kevin Durant<\/a> has become a central provider of open-source natural language processing (NLP) infrastructure for the AI community. HuggingFace boasts an impressive list of users, <a href=\"https:\/\/huggingface.co\/#tech\">including the big four<\/a> of the AI world (Facebook, Google, Microsoft, and Amazon). What\u2019s most surprising is that, despite their completely open-source business model, HuggingFace has been cash-flow positive and maintains a staff of under 100 people. This blogpost will describe the basics of their business model and attempt to explain how they\u2019ve accomplished so much with so little.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Value Creation<\/h2>\n\n\n\n<p>HuggingFace\u2019s core product is an easy-to-use NLP modeling library. The library, Transformers, is both free and <a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/docs\/transformers\/pipeline_tutorial\" target=\"_blank\">ridiculously easy to use<\/a>. 
With as few as three lines of code, you could be using cutting-edge NLP models like <a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/bert-base-uncased\" target=\"_blank\">BERT<\/a> or <a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/gpt2\" target=\"_blank\">GPT2<\/a> to generate text, answer questions, summarize larger bodies of text, or perform any number of other standard NLP tasks. The library is fully compatible with popular deep learning frameworks like PyTorch and TensorFlow. The library also provides simple hooks for custom training or fine-tuning of existing models.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/jalammar.github.io\/images\/t\/transformer_resideual_layer_norm_3.png\" alt=\"transformer_resideual_layer_norm_3.png (1415\u00d7804)\" \/><figcaption><em>The conceptual architecture behind Transformer models. To learn more about Transformer Architectures, see <a href=\"https:\/\/jalammar.github.io\/illustrated-transformer\/\">this amazing blogpost<\/a>.<\/em><\/figcaption><\/figure>\n\n\n\n<p>Amazingly, HuggingFace does not charge for their core product; rather, they open-source the library and provide it at zero charge. The models and core library are all <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/huggingface\" target=\"_blank\">available on GitHub<\/a> under the Apache License 2.0, a permissive license that allows others to build on and capture value from their work with few conditions. The company actively responds to technical issues its users encounter, and generally seems to aim for the widest possible adoption of its models.<\/p>\n\n\n\n<p>The core value of HuggingFace comes from distilling the work of the broader research community and making it accessible via thoughtful tool design. HuggingFace does not (for the most part) research its own models, but rather builds on the research of others. 
Importantly, the research community also has a norm of sharing the products of its research as open-source code, which enables HuggingFace to do this at extremely low cost. HuggingFace <a rel=\"noreferrer noopener\" href=\"https:\/\/dagshub.com\/blog\/huggingface-cto-julien-chaumond-on-large-models-in-production\/#:~:text=So%20if%20you%20can%20emphasize%20that%20API%20design%20is%20an%20important%20part%20and%20design%2C%20in%20general%2C%20is%20an%20important%20part%20of%20user%20experience%2C%20developer%20experience%2C%20and%20it%27s%20going%20to%20make%20your%20life%20way%20easier%20in%20the%20long%20term\" target=\"_blank\">spends a lot of effort<\/a> on the software design that makes their models accessible to others; the heavy focus on UX is a big reason for their popularity in the research community.<\/p>\n\n\n\n<p>Beyond their core products, HuggingFace is deeply embedded within the NLP research community, and uses that position to create additional value. HuggingFace <a rel=\"noreferrer noopener\" href=\"https:\/\/marksaroufim.substack.com\/p\/huggingface?s=r\" target=\"_blank\">organizes a large community of users<\/a> who share the company\u2019s norms around openness. They <a rel=\"noreferrer noopener\" href=\"https:\/\/dagshub.com\/blog\/huggingface-cto-julien-chaumond-on-large-models-in-production\/#:~:text=But%20in%20the%20open%2C%20as%20part%20of%20an%20opened%2C%20wide%20reaching%2Dcollaboration%20with%20the%20largest%20possible%20number%20of%20institutions%20like%20companies%20and%20universities\" target=\"_blank\">collaborate with universities and larger companies<\/a> on research papers. They\u2019ve coordinated with large MLOps infrastructure providers to ensure their service is available on the main cloud computing services (e.g. <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/hugging-face.html\" target=\"_blank\">AWS SageMaker<\/a>). 
One PhD researcher I\u2019ve spoken with went so far as to say \u201cI don\u2019t really know how I\u2019d do [big-model] NLP research without HuggingFace\u201d.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Value Capture<\/h2>\n\n\n\n<p>Amazingly, <a rel=\"noreferrer noopener\" href=\"https:\/\/venturebeat.com\/2021\/03\/11\/hugging-face-triples-investment-in-open-source-machine-learning-models\/#:~:text=the%20company%20has%20been%20cash%2Dpositive%20in%20the%20first%20months%20of%202021\" target=\"_blank\">the company has been cash-flow positive<\/a> for over a year. The company does this by providing <a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/pricing\" target=\"_blank\">consulting and infrastructural services<\/a> to aid in the use and application of their product. In particular, the company\u2019s specialty in operating large language models enables them to <a rel=\"noreferrer noopener\" href=\"https:\/\/dagshub.com\/blog\/huggingface-cto-julien-chaumond-on-large-models-in-production\/#:~:text=we%20help%20a%20lot%20of%20companies%20actually%20on%20exactly%20this%20kind%20of%20workload%20like%20scaling%20BERT%20type%20inference\" target=\"_blank\">collaborate with companies<\/a> to help them run these models efficiently at scale. 
Demand for this type of service has exploded this past year, with a sharp <a rel=\"noreferrer noopener\" href=\"https:\/\/dagshub.com\/blog\/huggingface-cto-julien-chaumond-on-large-models-in-production\/#:~:text=Under%20the%20production%20side%2C%20I%20would%20say%20definitely%20classification.%20A%20lot%20of%20actual%20use%20cases%20in%20organizations%20around%20document%20classifications%20or%20token%20classification%20to%20do%20information%20extraction\" target=\"_blank\">rise in demand for document classification<\/a> across many organizations.<\/p>\n\n\n\n<p>HuggingFace is effectively <a rel=\"noreferrer noopener\" href=\"https:\/\/dagshub.com\/blog\/huggingface-cto-julien-chaumond-on-large-models-in-production\/#:~:text=What%20I%20would,open%20source%20companies\" target=\"_blank\">pioneering a new business model<\/a>, pushing the business models of AI away from capturing value from models directly, and towards capturing value from the complementary products and processes necessary for deploying them. The company is betting that machine learning will be as important in the future as software engineering is today (<a rel=\"noreferrer noopener\" href=\"https:\/\/dagshub.com\/blog\/huggingface-cto-julien-chaumond-on-large-models-in-production\/#:~:text=We%20expect%20that%20machine%20learning%20is%20going%20to%20be%20as%20big%20or%20bigger%20than%20software%20engineering%20in%205%20years\" target=\"_blank\">source<\/a>).<\/p>\n\n\n\n<p>Is this a sustainable business model? It\u2019s hard to argue with the results. The core reason they are profitable is that they have extremely low costs relative to the value that they are creating. 
The company successfully <a rel=\"noreferrer noopener\" href=\"https:\/\/techcrunch.com\/2021\/03\/11\/hugging-face-raises-40-million-for-its-natural-language-processing-library\/?guccounter=1&amp;guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&amp;guce_referrer_sig=AQAAAJe51RRPirZj-PuS9XNZW7k_7xIPo3z-6JC7UkkoNxGTXf40_zJMsWyNuHlndyM6QEBvRVKNIbbc3sPsHqxeLXqQNFRyf_2Fm3bzTlWK79exKYLNhe6Y6nJV0kYhMVavZuuTDKLqjpXzRPyltnpT9fMBN7k6Vmrgav-lC_TXll_O\" target=\"_blank\">raised a Series B round in early 2021<\/a> to grow the size of their team, <a rel=\"noreferrer noopener\" href=\"https:\/\/venturebeat.com\/2021\/03\/11\/hugging-face-triples-investment-in-open-source-machine-learning-models\/#:~:text=%E2%80%9CWe%E2%80%99ve%20always%20had%20acquisition%20interests%20from%20Big%20Tech%20and%20others%2C%20but%20we%20believe%20it%E2%80%99s%20good%20to%20have%20independent%20companies%20%E2%80%94%20that%E2%80%99s%20what%20we%E2%80%99re%20trying%20to%20do.%E2%80%9D\" target=\"_blank\">resisting acquisition interest<\/a> from the big tech companies. It seems fairly clear, though, that they\u2019re leaving tremendous value to be captured by others, especially those providing the technical infrastructure necessary for AI services. However, their openness does seem to generate a lot of benefit for our society. 
For that reason, HuggingFace deserves a big hug.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Learn how the leading provider of large language models does it with a completely open source business model<\/p>\n","protected":false},"author":19172,"featured_media":15452,"comment_status":"open","ping_status":"closed","template":"","categories":[],"class_list":["post-15450","hck-submission","type-hck-submission","status-publish","has-post-thumbnail","hentry"],"connected_submission_link":"https:\/\/d3.harvard.edu\/platform-digit\/assignment\/machine-learning-2\/","yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Hugging Face: Embracing Natural Language Processing - Digital Innovation and Transformation<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hugging Face: Embracing Natural Language Processing - Digital Innovation and Transformation\" \/>\n<meta property=\"og:description\" content=\"Learn how the leading provider of large language models does it with a completely open source business model\" \/>\n<meta property=\"og:url\" content=\"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/\" \/>\n<meta property=\"og:site_name\" content=\"Digital Innovation and Transformation\" \/>\n<meta property=\"article:modified_time\" content=\"2022-04-22T23:44:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2022\/04\/demo-huggingface_optimized-e1650032899960.png\" \/>\n\t<meta 
property=\"og:image:width\" content=\"370\" \/>\n\t<meta property=\"og:image:height\" content=\"114\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/\",\"name\":\"Hugging Face: Embracing Natural Language Processing - Digital Innovation and Transformation\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2022\\\/04\\\/demo-huggingface_optimized-e1650032899960.png\",\"datePublished\":\"2022-04-15T14:29:31+00:00\",\"dateModified\":\"2022-04-22T23:44:16+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/d3.harva
rd.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/#primaryimage\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2022\\\/04\\\/demo-huggingface_optimized-e1650032899960.png\",\"contentUrl\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/wp-content\\\/uploads\\\/sites\\\/2\\\/2022\\\/04\\\/demo-huggingface_optimized-e1650032899960.png\",\"width\":370,\"height\":114},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/hugging-face-embracing-natural-language-processing\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Submissions\",\"item\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/submission\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Hugging Face: Embracing Natural Language Processing\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/#website\",\"url\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/\",\"name\":\"Digital Innovation and Transformation\",\"description\":\"MBA Student Perspectives\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/d3.harvard.edu\\\/platform-digit\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Hugging Face: Embracing Natural Language Processing - Digital Innovation and Transformation","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/","og_locale":"en_US","og_type":"article","og_title":"Hugging Face: Embracing Natural Language Processing - Digital Innovation and Transformation","og_description":"Learn how the leading provider of large language models does it with a completely open source business model","og_url":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/","og_site_name":"Digital Innovation and Transformation","article_modified_time":"2022-04-22T23:44:16+00:00","og_image":[{"width":370,"height":114,"url":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2022\/04\/demo-huggingface_optimized-e1650032899960.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/","url":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/","name":"Hugging Face: Embracing Natural Language Processing - Digital Innovation and Transformation","isPartOf":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/#website"},"primaryImageOfPage":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/#primaryimage"},"image":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/#primaryimage"},"thumbnailUrl":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2022\/04\/demo-huggingface_optimized-e1650032899960.png","datePublished":"2022-04-15T14:29:31+00:00","dateModified":"2022-04-22T23:44:16+00:00","breadcrumb":{"@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/#primaryimage","url":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2022\/04\/demo-huggingface_optimized-e1650032899960.png","contentUrl":"https:\/\/d3.harvard.edu\/platform-digit\/wp-content\/uploads\/sites\/2\/2022\/04\/demo-huggingface_optimized-e1650032899960.png","width":370,"height":114},{"@type":"BreadcrumbList","@id":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/hugging-face-embracing-natural-language-processing\/#breadcrumb","itemListElement":
[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/d3.harvard.edu\/platform-digit\/"},{"@type":"ListItem","position":2,"name":"Submissions","item":"https:\/\/d3.harvard.edu\/platform-digit\/submission\/"},{"@type":"ListItem","position":3,"name":"Hugging Face: Embracing Natural Language Processing"}]},{"@type":"WebSite","@id":"https:\/\/d3.harvard.edu\/platform-digit\/#website","url":"https:\/\/d3.harvard.edu\/platform-digit\/","name":"Digital Innovation and Transformation","description":"MBA Student Perspectives","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/d3.harvard.edu\/platform-digit\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission\/15450","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission"}],"about":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/types\/hck-submission"}],"author":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/users\/19172"}],"replies":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/comments?post=15450"}],"version-history":[{"count":4,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission\/15450\/revisions"}],"predecessor-version":[{"id":15651,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/hck-submission\/15450\/revisions\/15651"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/media\/15452"}],"wp:attachment":[{"href":"https:\/\/d3.harvard.edu\/platform-digit\/wp-json\/wp\/v2\/media?parent=15450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/d3.harvard.edu\/platform
-digit\/wp-json\/wp\/v2\/categories?post=15450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}