LLMpedia: The first transparent, open encyclopedia generated by LLMs

GPT (OpenAI)

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: AllenNLP (Hop 5)
Expansion Funnel: Raw 185 → Dedup 0 → NER 0 → Enqueued 0
GPT (OpenAI)
Name: GPT (OpenAI)
Developer: OpenAI
First release: 2018
Latest release: 2023
Programming language: Python
Platform: Cloud, API

GPT (OpenAI) is a family of large-scale transformer-based language models developed by OpenAI. Beginning with the original GPT in 2018, the family comprised a sequence of influential releases that shaped research and deployment across technology, academia, and industry, and drew the attention of institutions and public figures in computing, policy, and media.

Overview

GPT emerged amid intense activity around transformer architectures pioneered by researchers at Google Research. Related work at Facebook AI Research, Microsoft Research, DeepMind, Stanford University, Massachusetts Institute of Technology, Carnegie Mellon University, University of California, Berkeley, University of Toronto, ETH Zurich, University of Oxford, University of Cambridge, Princeton University, Columbia University, Harvard University, Yale University, California Institute of Technology, Imperial College London, University of Washington, University of Montreal, Tsinghua University, Peking University, University of Edinburgh, Cornell University, University of Michigan, University of Illinois Urbana-Champaign, University of California, San Diego, New York University, University of Pennsylvania, Tokyo Institute of Technology, Seoul National University, National University of Singapore, EPFL, University of Sydney, KTH Royal Institute of Technology, McGill University, the RIKEN Center for Advanced Intelligence Project, the Canadian Institute for Advanced Research, and the Allen Institute for AI, together with industry players such as NVIDIA, Intel, AMD, and Google DeepMind, and OpenAI's own collaborators and funders, shaped the landscape in which GPT developed.
The development attracted attention from policymakers in the European Commission, the United States Department of Defense, the United Kingdom Government, the Congress of the United States, the White House, the United Nations, the Organisation for Economic Co-operation and Development, the World Economic Forum, the G7, the G20, and the European Parliament, and from civil society groups such as the Electronic Frontier Foundation and the American Civil Liberties Union. It also drew coverage in outlets including The New York Times, The Guardian, The Washington Post, BBC News, Reuters, Bloomberg, the Financial Times, and The Wall Street Journal, and commentary from writers at MIT Technology Review and Wired and from researchers publishing in Nature, Science, and arXiv and at venues such as NeurIPS, ICLR, ACL, ICML, and AAAI.

Architecture and Models

GPT models use the decoder-only transformer architecture introduced in the 2017 paper "Attention Is All You Need" by researchers at Google and popularized through venues such as NeurIPS and ICLR. Successive model families scaled parameters, compute, and dataset size, paralleling efforts at DeepMind with models such as Gopher, and initiatives from Anthropic, Cohere, Meta Platforms, IBM Research, Huawei Noah's Ark Lab, Baidu Research, Alibaba DAMO Academy, and Salesforce Research. The models leveraged hardware and software ecosystems supported by NVIDIA, Intel, AMD, Google Cloud, Microsoft Azure, and Amazon Web Services, frameworks such as PyTorch, TensorFlow, and JAX, and libraries developed at Stanford University and Berkeley AI Research. Research milestones were presented alongside work by researchers and industry figures including Yoshua Bengio, Geoffrey Hinton, Yann LeCun, Ilya Sutskever, Andrej Karpathy, Sam Altman, and Dario Amodei, and by teams collaborating across institutions including Harvard University and Stanford University.
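The core mechanism of the decoder-only transformer can be illustrated with a minimal single-head causal self-attention sketch. This is an illustrative simplification, not OpenAI's implementation: real GPT models use many layers of multi-head attention with layer normalization, feed-forward blocks, and learned embeddings, and all weight names below are invented for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a sequence x of shape (T, d)."""
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(d)
    # Causal mask: position t may attend only to positions <= t,
    # so strictly-future positions are set to -inf before the softmax.
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf
    return softmax(scores) @ v

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Because of the causal mask, the first position can attend only to itself, which is what lets such models generate text one token at a time.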

Training and Data

Training of GPT models relied on massive corpora compiled from publicly available sources and licensed datasets, practices that intersected with legal and academic concerns in disputes involving publishers such as Elsevier, RELX, Wolters Kluwer, Springer Nature, Penguin Random House, and The New York Times Company, and aggregators such as Common Crawl. Data governance discussions involved stakeholders including the European Commission, the United States Copyright Office, the Authors Guild, Creative Commons, the Internet Archive, Wikipedia, and Project Gutenberg, as well as research communities around ACL, NeurIPS, ICLR, and ICML. Training employed compute supplied by cloud providers and by supercomputing centers linked to Lawrence Berkeley, Argonne, Oak Ridge, Sandia, and Los Alamos national laboratories, alongside the corporate data centers of Microsoft Corporation and Amazon Web Services.
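The objective applied to these corpora is next-token prediction: minimize the cross-entropy between the model's predicted distribution over the vocabulary and the token that actually comes next. A minimal sketch of that loss (shapes and names are illustrative, not from any OpenAI codebase):

```python
import numpy as np

def next_token_cross_entropy(logits, targets):
    """Mean cross-entropy loss for next-token prediction.

    logits:  (T, V) unnormalized scores over a vocabulary of size V
    targets: (T,)   index of the true next token at each position
    """
    # Log-softmax with max-subtraction for numerical stability.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Negative log-probability assigned to each true next token, averaged.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Sanity check: a model that is uniform over V tokens incurs a loss of log(V).
V = 16
uniform_logits = np.zeros((5, V))
targets = np.array([3, 1, 4, 1, 5])
loss = next_token_cross_entropy(uniform_logits, targets)
print(round(loss, 4))  # log(16) ≈ 2.7726
```

Scaling this objective to ever-larger corpora and parameter counts is what drove the compute demands described above.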

Capabilities and Applications

GPT family models demonstrated capabilities across tasks historically studied in labs at MIT, Stanford University, Carnegie Mellon University, the University of Toronto, and Harvard University, and in industry groups at Google Research, Microsoft Research, Facebook AI Research, DeepMind, IBM Research, and OpenAI. Applications spanned products and services from Microsoft Corporation, GitHub, Salesforce, Shopify, Snap Inc. (Snapchat), Discord, Reddit, Grammarly, Canva, Duolingo, Khan Academy, Pearson PLC, Udacity, Coursera, Byju's, and Zoom Video Communications, as well as startups incubated at Y Combinator. Use cases included software-development assistance popularized through integrations such as GitHub Copilot in Visual Studio Code; creative writing in collaboration with publishers like Penguin Random House; educational tools at institutions such as Khan Academy and Coursera; customer support; healthcare prototypes referenced by groups at the Mayo Clinic, Johns Hopkins Medicine, the Cleveland Clinic, and the Mount Sinai Health System; and deployments at finance firms including JPMorgan Chase, Goldman Sachs, Morgan Stanley, and BlackRock.
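Many of these applications hinge on how text is decoded from the model: at each step the model emits a score for every vocabulary token, and a temperature parameter trades determinism against diversity. A minimal sampling sketch (illustrative only; production systems combine this with techniques such as top-k or nucleus sampling):

```python
import numpy as np

def sample_token(logits, temperature=1.0, rng=None):
    """Sample a token index from logits scaled by temperature.

    temperature -> 0 approaches greedy argmax decoding;
    temperature > 1 flattens the distribution for more diverse output.
    """
    if rng is None:
        rng = np.random.default_rng()
    if temperature <= 0:  # treat 0 (or below) as greedy decoding
        return int(np.argmax(logits))
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # numerical stability before exponentiating
    probs = np.exp(z) / np.exp(z).sum()
    return int(rng.choice(len(probs), p=probs))

logits = np.array([2.0, 1.0, 0.1])
print(sample_token(logits, temperature=0))  # 0, the highest-scoring token
```

Low temperatures suit tasks like code completion, where determinism matters; higher temperatures suit creative-writing use cases.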

Safety, Ethics, and Policy

Safety research around GPT engaged ethicists and policy scholars from Harvard University, Yale University, Stanford University, the University of Oxford, the University of Cambridge, Princeton University, Columbia University, ETH Zurich, and the University of Toronto, as well as regulatory bodies including the European Commission, the United States Department of Commerce, the Federal Trade Commission, the United Kingdom Information Commissioner's Office, the World Health Organization, and UNESCO. Debates referenced frameworks and reports from OpenAI, DeepMind, Anthropic, the Partnership on AI, the Future of Life Institute, the AI Now Institute, the Center for a New American Security, the Brookings Institution, the Berkman Klein Center, the RAND Corporation, the Center for Strategic and International Studies, and the Carnegie Endowment for International Peace, along with legal cases involving the Authors Guild and media companies. Topics included model alignment, robustness, misinformation, privacy, and regulatory proposals advanced by legislators in the United States Congress and panels at the World Economic Forum.

Reception and Impact

GPT drew responses from academic reviewers in Nature and Science, peer reviewers at NeurIPS and ICLR, technology journalists at The New York Times, The Guardian, Wired, and MIT Technology Review, and public commentary from leaders such as Tim Cook, Satya Nadella, Sundar Pichai, Elon Musk, Mark Zuckerberg, Bill Gates, Jeff Bezos, Larry Page, Sergey Brin, Reid Hoffman, Peter Thiel, Yoshua Bengio, Geoffrey Hinton, and Yann LeCun. Economic and labor impacts were analyzed by institutions including the International Labour Organization, the OECD, the World Bank, the IMF, the Brookings Institution, McKinsey & Company, and Deloitte. Cultural and educational effects engaged communities at Wikipedia, Reddit, Stack Overflow, GitHub, YouTube, Twitter, and Meta Platforms (formerly Facebook), and inspired regulatory scrutiny from the European Commission and national legislatures.

Category:Artificial intelligence