imported>InternetArchiveBot: Rescuing 1 source and tagging 0 as dead.) #IABot (v2.0.9.5

2025-02-04T02:12:06Z

{{noteTA|G1=IT}}
[[File:comparison_image_neural_networks.svg|thumb|Comparison of the LeNet and AlexNet convolution, pooling and dense layers]]
'''AlexNet''' is a [[卷积神经网络|convolutional neural network]] designed by [[亚历克斯·克里泽夫斯基|Alex Krizhevsky]]<ref name=quartz>{{cite web|website=[[石英财经网|Quartz]]|author=Dave Gershgorn|title=The inside story of how AI got good enough to dominate Silicon Valley|url=https://qz.com/1307091/the-inside-story-of-how-ai-got-good-enough-to-dominate-silicon-valley/|date=2018-06-18|accessdate=2018-10-05|archive-date=2019-12-12|archive-url=https://web.archive.org/web/20191212224842/https://qz.com/1307091/the-inside-story-of-how-ai-got-good-enough-to-dominate-silicon-valley/|dead-url=no}}</ref> and published jointly with [[伊尔亚·苏茨克维|Ilya Sutskever]] and Krizhevsky's doctoral advisor, [[杰弗里·辛顿|Geoffrey Hinton]]<ref name =":1">{{cite web|url=https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/|title=The data that transformed AI research—and possibly the world|accessdate=2020-01-17|archive-date=2017-07-27|archive-url=https://web.archive.org/web/20170727173117/https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/|dead-url=no}}</ref><ref name =":2">{{cite web|url=http://www.image-net.org/challenges/LSVRC/2012/results.html|title=ILSVRC2012 Results|accessdate=2020-01-17|archive-date=2020-01-16|archive-url=https://web.archive.org/web/20200116011309/http://image-net.org/challenges/LSVRC/2012/results.html|dead-url=no}}</ref>.

AlexNet competed in the [[ImageNet]] Large Scale Visual Recognition Challenge held on 30 September 2012<ref name=":0">{{Cite journal|last=Krizhevsky|first=Alex|last2=Sutskever|first2=Ilya|last3=Hinton|first3=Geoffrey E.|date=2017-05-24|title=ImageNet classification with deep convolutional neural networks|url=https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf|journal=Communications of the ACM|volume=60|issue=6|pages=84–90|doi=10.1145/3065386|issn=0001-0782|via=|access-date=2020-01-17|archive-date=2017-05-16|archive-url=https://web.archive.org/web/20170516174757/http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf|dead-url=no}}</ref>, where it achieved the lowest top-5 error rate, 15.3%, which was 10.8 percentage points below that of the runner-up. The main conclusion of the original paper was that the depth of the model was essential to its performance. The network was computationally expensive, but training on [[图形处理器|graphics processing units]] (GPUs) made the computation feasible<ref name=":0" />.
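
The top-5 error rate mentioned above counts a prediction as wrong only when the true label is missing from the model's five highest-scoring classes. The following is a minimal sketch of that metric in plain Python. The function name <code>top_k_error</code> and the toy scores and labels are assumptions made for illustration; they are not taken from the paper or from the challenge's evaluation code.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k highest scores.

    scores: list of per-class score lists, one list per example.
    labels: list of true class indices, one per example.
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data (hypothetical): 2 examples, 7 classes.
scores = [
    [0.05, 0.30, 0.10, 0.20, 0.15, 0.12, 0.08],  # true class 2 ranks fifth
    [0.40, 0.25, 0.15, 0.10, 0.05, 0.03, 0.02],  # true class 6 ranks last
]
labels = [2, 6]

print("top-5 error:", top_k_error(scores, labels, k=5))
print("top-1 error:", top_k_error(scores, labels, k=1))
```

With <code>k=1</code> the same function gives the ordinary (top-1) classification error, which is never lower than the top-5 error on the same predictions.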

== Background ==
AlexNet was not the first fast GPU implementation of a convolutional neural network (CNN) to win an image recognition contest. A CNN on GPU by K. Chellapilla et al. (2006) was 4 times faster than an equivalent CPU implementation<ref>{{cite book|author1=Kumar Chellapilla|author2=Sid Puri|author3=Patrice Simard|editor1-last=Lorette|editor1-first=Guy|title=Tenth International Workshop on Frontiers in Handwriting Recognition|date=2006|publisher=Suvisoft|chapter-url=https://hal.inria.fr/inria-00112631/document|archivedate=2020-05-18|chapter=High Performance Convolutional Neural Networks for Document Processing|access-date=2020-01-17|archive-url=https://web.archive.org/web/20200518193413/https://hal.inria.fr/inria-00112631/document|dead-url=no}}</ref>. A deep CNN by Dan Ciresan et al. (2011) at [[IDSIA]] was already 60 times faster<ref name="flexible">{{cite journal|last=Ciresan|first=Dan|author2=Ueli Meier|author3=Jonathan Masci|author4=Luca M. Gambardella|author5=Jurgen Schmidhuber|title=Flexible, High Performance Convolutional Neural Networks for Image Classification|journal=Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence-Volume Volume Two|year=2011|volume=2|pages=1237–1242|url=http://www.idsia.ch/~juergen/ijcai2011.pdf|accessdate=2013-11-17|archive-date=2013-11-16|archive-url=https://web.archive.org/web/20131116121919/http://www.idsia.ch/~juergen/ijcai2011.pdf|dead-url=no}}</ref> and achieved superhuman performance in August 2011<ref>{{Cite web|url=http://benchmark.ini.rub.de/?section=gtsrb&subsection=results|title=IJCNN 2011 Competition result table|website=OFFICIAL IJCNN2011 COMPETITION|language=|access-date=2019-01-14|date=2010|archive-date=2019-01-21|archive-url=https://web.archive.org/web/20190121064353/http://benchmark.ini.rub.de/?section=gtsrb&subsection=results|dead-url=no}}</ref>. Between 15 May 2011 and 10 September 2012, their CNN won no fewer than four image competitions<ref>{{Cite web|url=http://people.idsia.ch/~juergen/computer-vision-contests-won-by-gpu-cnns.html|last1=Schmidhuber|first1=Jürgen|title=History of computer vision contests won by deep CNNs on GPU|language=|access-date=2019-01-14|date=2017-03-17|archive-date=2018-12-19|archive-url=https://web.archive.org/web/20181219224934/http://people.idsia.ch/~juergen/computer-vision-contests-won-by-gpu-cnns.html|dead-url=no}}</ref><ref name="schdeepscholar">{{cite journal|last1=Schmidhuber|first1=Jürgen|title=Deep Learning|journal=Scholarpedia|url=http://www.scholarpedia.org/article/Deep_Learning|date=2015|volume=10|issue=11|pages=1527–54|pmid=16764513|doi=10.1162/neco.2006.18.7.1527|citeseerx=10.1.1.76.1541|access-date=2020-01-17|archive-date=2016-04-19|archive-url=https://web.archive.org/web/20160419024349/http://www.scholarpedia.org/article/Deep_Learning|dead-url=no}}</ref>. They also significantly improved on the best performance reported in the literature for several image [[数据库|databases]]<ref name="mcdns">{{cite book |last1=Ciresan |first1=Dan |first2=Ueli |last2=Meier |first3=Jürgen |last3=Schmidhuber |title=Multi-column deep neural networks for image classification |journal=2012 IEEE Conference on Computer Vision and Pattern Recognition |date=June 2012 |pages=3642–3649 |doi=10.1109/CVPR.2012.6248110 |arxiv=1202.2745 |isbn=978-1-4673-1226-4 |oclc=812295155 |publisher=[[电气电子工程师学会|Institute of Electrical and Electronics Engineers]] (IEEE) |location=New York, NY|citeseerx=10.1.1.300.3283 }}</ref>.

According to the AlexNet paper<ref name=":0" />, Ciresan's earlier network is "somewhat similar". Both were originally written in [[CUDA]] to run with [[圖形處理器|GPU]] support. In fact, both are variants of the CNN designs introduced by [[杨立昆|Yann LeCun]] et al. (1989)<ref name="lecun1">Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, [http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf Backpropagation Applied to Handwritten Zip Code Recognition] {{Wayback|url=http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf |date=20200110090230 }}; AT&T Bell Laboratories</ref><ref name="lecun98">{{cite journal|last=LeCun|first=Yann|author2=Léon Bottou|author3=Yoshua Bengio|author4=Patrick Haffner|title=Gradient-based learning applied to document recognition|journal=Proceedings of the IEEE|year=1998|volume=86|issue=11|pages=2278–2324|url=http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf|accessdate=2016-10-07|doi=10.1109/5.726791|citeseerx=10.1.1.32.9552|archive-date=2017-12-15|archive-url=https://web.archive.org/web/20171215083109/http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf|dead-url=yes}}</ref>, who applied the [[反向传播算法|backpropagation algorithm]] to a variant of the "[[neocognitron]]", the CNN architecture originally proposed by Kunihiko Fukushima ({{lang|ja|福島邦彦}})<ref name=fukuneoscholar>{{cite journal | last1 = Fukushima | first1 = K. | year = 2007 | title = Neocognitron | url = | journal = Scholarpedia | volume = 2 | issue = 1| page = 1717 | doi=10.4249/scholarpedia.1717}}</ref><ref name="intro">{{cite journal|last=Fukushima|first=Kunihiko|title=Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position|journal=Biological Cybernetics|year=1980|volume=36|issue=4|pages=193–202|url=http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|accessdate=2013-11-16|doi=10.1007/BF00344251|pmid=7370364|archive-date=2014-06-03|archive-url=https://web.archive.org/web/20140603013137/http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf|dead-url=no}}</ref>. The architecture was later modified by the [[卷积神经网络|max-pooling]] method proposed by J. Weng<ref name="weng1993">{{cite journal |first1=J |last1=Weng |first2=N |last2=Ahuja |first3=TS |last3=Huang |title=Learning recognition and segmentation of 3-D objects from 2-D images |journal=Proc. 4th International Conf. Computer Vision |year=1993 |pages=121–128 }}</ref><ref name="schdeepscholar" />.

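== Network design ==
AlexNet contains eight layers. The first five are [[卷积|convolutional]] layers, some of which are followed by [[卷积神经网络|max-pooling]] layers, and the last three are fully connected layers<ref name=":0" />. It uses the non-saturating [[线性整流函数|ReLU]] activation function, which showed better training performance than [[双曲函数|tanh]] and [[S函数|sigmoid]]<ref name=":0" />.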
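
The layer stack described above can be traced with a short shape calculation. The sketch below is illustrative only: the kernel counts, kernel sizes, strides and fully connected widths follow the original paper, but the 227×227 input size and the padding values are assumptions chosen so that the arithmetic works out, and the paper's two-GPU split and response normalisation are left out. The second half compares activation gradients to show what "non-saturating" means in practice.

```python
import math


def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kind, output channels, kernel, stride, padding)
# Padding values and the 227-pixel input are assumptions (see lead-in).
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
]
FC_SIZES = [4096, 4096, 1000]  # three fully connected layers


def trace_shapes(input_size=227, channels=3):
    """Return (name, channels, height, width) after each conv/pool layer."""
    shapes = []
    size = input_size
    for name, kind, out_channels, kernel, stride, padding in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if kind == "conv":  # pooling keeps the channel count unchanged
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


def relu_grad(x):
    return 1.0 if x > 0 else 0.0


def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2


def sigmoid_grad(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


if __name__ == "__main__":
    for name, c, h, w in trace_shapes():
        print(f"{name}: {c} x {h} x {w}")
    _, c, h, w = trace_shapes()[-1]
    print("flattened features:", c * h * w, "-> FC", FC_SIZES)
    # For a large positive input the ReLU gradient stays at 1, while the
    # tanh and sigmoid gradients shrink towards 0 (saturation).
    for x in (0.5, 5.0, 10.0):
        print(x, relu_grad(x), tanh_grad(x), sigmoid_grad(x))
```

Only the convolutional and fully connected layers carry learned weights, which is why the pooling entries are not counted among the eight layers.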

== Influence ==
AlexNet is considered one of the most influential papers in computer vision. It spurred many further papers that use convolutional neural networks and GPUs to accelerate deep learning<ref>{{Cite web|url=https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html|title=The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3)|last=Deshpande|first=Adit|website=adeshpande3.github.io|access-date=2018-12-04|archive-date=2018-11-21|archive-url=https://web.archive.org/web/20181121185730/https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html|dead-url=no}}</ref>.
According to Google Scholar, the AlexNet paper had been cited more than 157,000 times as of mid-2024<ref>{{Cite web |url=https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xegzhJcAAAAJ&citation_for_view=xegzhJcAAAAJ:u5HHmVD_uO8C |title=AlexNet paper on Google Scholar |access-date=2024-07-02 |archive-date=2023-06-23 |archive-url=https://web.archive.org/web/20230623004408/https://scholar.google.com/citations?view_op=view_citation&hl=en&user=xegzhJcAAAAJ&citation_for_view=xegzhJcAAAAJ:u5HHmVD_uO8C |dead-url=no }}</ref>.

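== Alex Krizhevsky ==
Alex Krizhevsky (born in [[乌克兰|Ukraine]] and raised in [[加拿大|Canada]]) is a [[计算机科学|computer scientist]] noted for his work on [[人工神经网络|artificial neural networks]] and [[深度学习|deep learning]]. Shortly after winning the ImageNet 2012 challenge with AlexNet, he and his colleagues sold their startup, DNN Research Inc., to [[Google]]<ref name=quartz/>. Krizhevsky left Google in September 2017 after losing interest in the work<ref name=quartz/>. At the company Dessa, Krizhevsky will advise on and help with new deep learning techniques<ref name=quartz/>. Many of his papers on [[机器学习|machine learning]] and [[计算机视觉|computer vision]] are frequently cited by other researchers<ref name=GoogleScholar>{{cite web | title=Alex Krizhevsky | website=Google Scholar Citations | url=https://scholar.google.com/citations?user=xegzhJcAAAAJ | accessdate=2020-01-17 | archive-date=2020-04-17 | archive-url=https://web.archive.org/web/20200417011555/https://scholar.google.com/citations?user=xegzhJcAAAAJ | dead-url=no }}</ref>.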

==References==
{{reflist|30em}}

{{Differentiable computing}}
[[Category:神经网络软件]]
[[Category:深度学习]]
[[Category:神經網路架構]]
