| GoogLeNet | |
|---|---|
| Name | GoogLeNet |
| Developer | Google |
| Introduced | 2014 |
| Authors | Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich |
| Architecture | Convolutional Neural Network |
| Dataset | ImageNet |
| Notable | Inception module, winner of ILSVRC 2014 |
GoogLeNet is a deep convolutional neural network introduced by researchers at Google as the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014. It popularized the Inception module and showed that depth and width can be scaled efficiently for image classification, object detection, and transfer learning. The model shaped subsequent Inception-family architectures such as Inception-v2, Inception-v3, and Inception-ResNet-v2, fed the broader push toward deeper networks that produced ResNet, and saw wide adoption in industrial vision systems.
GoogLeNet was published by a team including Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich, most of whom were affiliated with Google. The model competed in ILSVRC 2014 and achieved state-of-the-art top-5 classification accuracy on ImageNet with a novel, computationally efficient design. Its name is a portmanteau of Google and LeNet, an homage to Yann LeCun's pioneering LeNet-5, and the architecture emphasized multi-scale feature extraction and parameter efficiency compared with earlier deep networks such as AlexNet, VGGNet, and ZFNet.
GoogLeNet introduced the Inception module, which combines parallel convolutional filters of different receptive fields (1×1, 3×3, 5×5) and a pooling path within a single block, enabling simultaneous multi-scale processing inspired by biological vision studies and prior multi-scale work in computer vision. The architecture stacks multiple Inception modules interleaved with max-pooling layers and uses 1×1 convolutions for dimensionality reduction, an idea influenced by the Network-in-Network architecture. GoogLeNet is 22 layers deep (counting only layers with parameters) and replaces the final fully connected layers with global average pooling, sharply reducing the parameter count compared with AlexNet and VGGNet. Auxiliary classifiers attached to intermediate Inception modules provided additional gradient signal during training, counteracting the vanishing-gradient problem in such a deep network.
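A minimal PyTorch sketch of an Inception block illustrates this structure. The channel counts in the example follow the paper's inception (3a) stage; the class name, parameter names, and tensor shapes are otherwise illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    """Parallel 1x1, 3x3, and 5x5 convolutions plus a pooling path,
    concatenated along the channel dimension. The 1x1 "reduce" layers
    shrink channel counts before the expensive 3x3/5x5 convolutions."""
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_ch, c1, kernel_size=1), nn.ReLU(inplace=True))
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_red, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c3_red, c3, kernel_size=3, padding=1), nn.ReLU(inplace=True))
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_red, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c5_red, c5, kernel_size=5, padding=2), nn.ReLU(inplace=True))
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1), nn.ReLU(inplace=True))

    def forward(self, x):
        # All branches preserve spatial size, so outputs can be concatenated.
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

# Channel counts of the paper's inception (3a) block: 192 in, 64+128+32+32 = 256 out.
block = Inception(192, 64, 96, 128, 16, 32, 32)
out = block(torch.randn(1, 192, 28, 28))  # -> torch.Size([1, 256, 28, 28])
```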
Training used large-scale supervised learning on ImageNet's 1.2 million-image training set, with asynchronous stochastic gradient descent, 0.9 momentum, a fixed learning-rate schedule (decreasing the rate by 4% every 8 epochs), and aggressive data augmentation, including sampling patches of varied sizes and aspect ratios and applying photometric distortions. The network was trained with Google's DistBelief distributed machine-learning system using modest model and data parallelism; co-author Yangqing Jia's Caffe framework later provided a widely used open-source implementation, as did TensorFlow after its 2015 release. The auxiliary classifiers contributed weighted losses during training and were discarded at inference.
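A hedged sketch of one training step with the auxiliary losses, using torchvision's GoogLeNet implementation. The 0.3 auxiliary-loss weight and 0.9 momentum follow the paper; the learning rate, weight decay, and the `train_step` helper are illustrative placeholders:

```python
import torch
import torch.nn as nn
from torchvision.models import googlenet

# Build GoogLeNet with auxiliary classifiers enabled (1000 ImageNet classes).
model = googlenet(aux_logits=True, init_weights=True)
model.train()  # the auxiliary heads are only active in training mode
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,      # illustrative values
                            momentum=0.9, weight_decay=2e-4)  # 0.9 momentum per the paper
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    optimizer.zero_grad()
    # In training mode torchvision returns (logits, aux_logits2, aux_logits1).
    logits, aux2, aux1 = model(images)
    # Each auxiliary loss is added with weight 0.3, as in the paper;
    # the heads are removed at inference time.
    loss = criterion(logits, labels) + 0.3 * (criterion(aux1, labels) +
                                              criterion(aux2, labels))
    loss.backward()
    optimizer.step()
    return loss.item()

loss = train_step(torch.randn(4, 3, 224, 224), torch.randint(0, 1000, (4,)))
```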
GoogLeNet won the ILSVRC 2014 classification task with a top-5 error of 6.67%, while using roughly 5 million parameters, about 12 times fewer than AlexNet and far fewer than VGGNet. Evaluation used the standard top-1 and top-5 accuracy metrics on the ImageNet validation and test sets. The model's modest computational profile, measured in FLOPs and memory, made it favorable for deployment in production systems at Google and influenced industrial image pipelines at companies such as Microsoft, Amazon, Baidu, and Alibaba Group.
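The top-5 criterion counts a prediction as correct if the true label appears among the five highest-scoring classes. A short sketch of that metric; the `topk_accuracy` helper and the random inputs are illustrative:

```python
import torch

def topk_accuracy(logits: torch.Tensor, labels: torch.Tensor, k: int = 5) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    topk = logits.topk(k, dim=1).indices             # (N, k) predicted class ids
    hits = (topk == labels.unsqueeze(1)).any(dim=1)  # is the true label in the top k?
    return hits.float().mean().item()

scores = torch.randn(8, 1000)                # e.g. scores over the 1000 ImageNet classes
targets = torch.randint(0, 1000, (8,))
print(topk_accuracy(scores, targets, k=5))   # top-5 error is 1 minus this value
```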
Subsequent work produced several Inception-family variants, including Inception-v2 (which integrated batch normalization), Inception-v3, and Inception-ResNet-v2 (which added residual connections inspired by Microsoft Research's ResNet). Other extensions combined Inception-style modules with ideas from MobileNet, DenseNet, and SqueezeNet to obtain lightweight models for the mobile and embedded devices built by Qualcomm, Apple Inc., and Samsung Electronics. Transfer learning and fine-tuning experiments adapted GoogLeNet-derived features to medical imaging, to autonomous driving at companies such as Tesla, Inc. and Waymo, and to scene-understanding research at Carnegie Mellon University and Toyota Research Institute.
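A minimal fine-tuning sketch, assuming a recent torchvision (0.13 or later) with its pretrained GoogLeNet weights; the 10-class target task and the choice to freeze the trunk are illustrative:

```python
import torch.nn as nn
from torchvision.models import googlenet, GoogLeNet_Weights

# Load ImageNet-pretrained GoogLeNet and swap in a new classifier head.
model = googlenet(weights=GoogLeNet_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                     # freeze the convolutional trunk
model.fc = nn.Linear(model.fc.in_features, 10)  # hypothetical 10-class task
# Only the new head's parameters are now trainable; training it on the
# target dataset reuses the frozen ImageNet features.
```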
GoogLeNet's Inception module reshaped research agendas at Google Research, DeepMind, Facebook AI Research, and academic labs such as Stanford University, MIT, and the University of Toronto, influencing architectures that balance depth, width, and computational cost. GoogLeNet backbones fed into object detection pipelines such as R-CNN, Fast R-CNN, and Faster R-CNN, and into end-to-end systems in robotics at the MIT Computer Science and Artificial Intelligence Laboratory and product teams at Google and Apple Inc. In education and industry, GoogLeNet's design principles are taught in courses at the Massachusetts Institute of Technology, Stanford University, and Carnegie Mellon University, and they continue to inform research on efficient model design at OpenAI, DeepMind, and university labs worldwide.