Enroll in CSCE 496 during spring 3-week session


If you're searching for a course to take during the spring 3-week session, consider enrolling in CSCE 496: The “Black” Art & “Elusive” Science of Training Deep Neural Networks.

CSCE 496: The “Black” Art & “Elusive” Science of Training Deep Neural Networks

Deep Learning (DL) is an exciting branch of modern Artificial Intelligence that cuts across diverse disciplines, solving hard problems in computer vision, natural language processing, and speech recognition, to name a few. A Deep Neural Network (DNN) is a large assembly of computational units, or artificial neurons, organized into many successive layers. DNNs discover hidden patterns in the input data by building layers of increasingly meaningful representations of that data. To unleash the full potential of DNNs, one must understand effective DNN architectures, learning algorithms, and optimization strategies. Training DNNs is tricky, no less than summoning a “genie”, and many consider it a “black art”. This course will demystify the training process of DNN models, including modern state-of-the-art DNN architectures. It will teach the “black” art and “elusive” science of training DNNs, adopting the philosophy of learning each concept at the very moment it is needed to accomplish some practical end. No previous background in Machine Learning is required.
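As a small taste of what “layers of increasingly meaningful representations” means in practice, here is a minimal sketch, in plain NumPy, of a two-layer network learning XOR via backpropagation. This is an illustrative toy (the architecture, learning rate, and data are my assumptions), not course material:

```python
import numpy as np

# Toy example: a 2-layer network learns XOR, which no single-layer
# (linear) model can represent. Illustrative sketch only.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Small random weight initialization breaks symmetry between neurons.
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # Forward pass: each layer builds a new representation of the input.
    h = np.tanh(X @ W1 + b1)            # hidden representation
    p = sigmoid(h @ W2 + b2)            # predicted probability
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    if step == 0:
        loss0 = loss

    # Backpropagation: the chain rule applied layer by layer.
    dlogits = (p - y) / len(X)          # gradient of cross-entropy w.r.t. logits
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(0)
    dh = dlogits @ W2.T * (1 - h**2)    # tanh derivative
    dW1 = X.T @ dh
    db1 = dh.sum(0)

    # Gradient descent update.
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(f"loss: {loss0:.3f} -> {loss:.3f}")
print((p > 0.5).astype(int).ravel())    # compare against targets 0, 1, 1, 0
```

The hidden layer learns intermediate features of the inputs that make the XOR pattern linearly separable for the output layer, which is exactly what deeper networks do at larger scale.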

Instructor: M. R. Hasan

Tentative topics:
- Week 1: Intro to Deep Learning, Linear Neural Networks, the Perceptron, Multi-Layer Perceptrons (MLPs), the Backpropagation algorithm, training MLPs; Deep Neural Networks (DNNs) and the challenges of training them.
- Week 2: The art & science of training DNNs. Algorithmic tricks: activation functions, weight-initialization techniques, and optimization approaches. Architectural tricks: batch normalization, gradient clipping. DNNs for Computer Vision: intro to Convolutional Neural Networks (CNNs), training CNNs effectively and efficiently, visualizing what CNNs learn, and creating generalizable representations of data via transfer learning.
- Week 3: Modern CNN architectures: AlexNet, VGG, GoogLeNet, ResNet, DenseNet; DNNs for Natural Language Processing (NLP): intro to Recurrent Neural Networks (RNNs) and modern RNN architectures: GRU, LSTM, etc.
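For a flavor of the Week 2 “algorithmic tricks”, the sketch below shows, in plain NumPy, why weight initialization matters in deep ReLU networks: He-style initialization (std = sqrt(2 / fan_in)) keeps activation magnitudes stable across layers, while a naive fixed tiny initialization makes them collapse toward zero. The depths, widths, and scales here are my own illustrative assumptions, not course code:

```python
import numpy as np

# Compare activation magnitudes after a stack of ReLU layers under
# two initialization schemes. Illustrative sketch only.
rng = np.random.default_rng(1)

def forward(x, std_fn, layers=20, width=256):
    """Push x through `layers` random ReLU layers; std_fn(fan_in) sets init scale."""
    for _ in range(layers):
        W = rng.normal(0, std_fn(width), (width, width))
        x = np.maximum(0.0, x @ W)    # ReLU activation
    return x

x = rng.normal(0, 1, (64, 256))       # batch of unit-variance inputs

he = forward(x, lambda n: np.sqrt(2.0 / n))   # He init: std = sqrt(2/fan_in)
tiny = forward(x, lambda n: 0.01)             # naive fixed tiny std

print(f"He init   activation std: {he.std():.3f}")   # stays on the order of 1
print(f"Tiny init activation std: {tiny.std():.3g}") # shrinks toward 0
```

With the tiny initialization, each layer multiplies the signal by a factor well below 1, so after 20 layers the activations (and hence the gradients) effectively vanish. This is one concrete instance of the training pitfalls the course addresses.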