EEL 4810 INTRODUCTION TO DEEP LEARNING ECE Department

Description

Hello, I need help solving the following questions, with the best possible answers please.



EEL 4810 HW 2 (Due Mar 22)

1. Optimality of l1 regularization
Recall that when l1 regularization is used, and assuming that the Hessian matrix H is
diagonal and positive definite, the objective function can be approximated by
L̂_R(θ) = Σ_{i=1}^d [ (1/2) H_ii (θ_i − θ_i^*)^2 + α |θ_i| ].
Find the optimal solution of this approximated objective function, and compare it with the
optimal solution of the objective function without regularization.
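For a quick sanity check (not part of the assignment), the following Python sketch minimizes the approximated objective coordinate by coordinate over a fine grid and prints the result next to the unregularized minimizer θ*; the values of H_ii, θ_i^*, and α below are made-up examples.

import numpy as np

# Made-up example values (assumptions, not given in the assignment).
H_diag = np.array([2.0, 0.5, 1.0])       # diagonal Hessian entries H_ii > 0
theta_star = np.array([1.5, 0.3, -0.8])  # unregularized minimizers theta_i*
alpha = 0.5                              # l1 regularization strength

def coord_objective(t, Hii, ti_star):
    # Per-coordinate term of the approximated objective:
    # (1/2) * H_ii * (t - theta_i*)^2 + alpha * |t|
    return 0.5 * Hii * (t - ti_star) ** 2 + alpha * np.abs(t)

# Brute-force minimization over a fine grid, one coordinate at a time.
grid = np.linspace(-3.0, 3.0, 60001)
theta_l1 = np.array([grid[np.argmin(coord_objective(grid, Hii, ts))]
                     for Hii, ts in zip(H_diag, theta_star)])

print("theta* (no regularization):", theta_star)
print("argmin with l1 penalty    :", theta_l1)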
2. Backpropagation of a fully connected network
Consider the following neural network, where h^1 = σ(W^1 x), h^2 = σ(W^2 h^1), and f(x) = w^3 h^2. Compute ∂f/∂W^1_{i,j}. Use σ′ to denote the derivative of the activation function σ.
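One common way to check a hand-derived expression for ∂f/∂W^1_{i,j} is to compare it against a finite-difference estimate. The Python sketch below does that for small random weights; the layer sizes and the choice of a logistic sigmoid are assumptions, since the network figure is not reproduced in this preview.

import numpy as np

rng = np.random.default_rng(0)
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))      # logistic sigmoid (assumed)
dsigma = lambda z: sigma(z) * (1.0 - sigma(z))  # its derivative sigma'

# Assumed sizes: x in R^4, h^1 in R^5, h^2 in R^3, f(x) scalar.
x = rng.normal(size=4)
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(3, 5))
w3 = rng.normal(size=3)

def forward(W1):
    h1 = sigma(W1 @ x)
    h2 = sigma(W2 @ h1)
    return float(w3 @ h2), h1, h2

f, h1, h2 = forward(W1)
delta2 = w3 * dsigma(W2 @ h1)              # df/dz^2, with z^2 = W^2 h^1
delta1 = (W2.T @ delta2) * dsigma(W1 @ x)  # df/dz^1, with z^1 = W^1 x
grad_W1 = np.outer(delta1, x)              # df/dW^1[i, j] = delta1[i] * x[j]

# Finite-difference check on one entry.
eps, i, j = 1e-6, 2, 1
W1p = W1.copy(); W1p[i, j] += eps
print(grad_W1[i, j], (forward(W1p)[0] - f) / eps)  # should agree to ~1e-5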
3. Feedforward neural network
Design a feedforward neural network that solves the XOR problem. The network is required to have at least one hidden layer with 3 neurons.
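As a concrete starting point (one possible answer, not the only one), the Python sketch below hard-codes a network with one hidden layer of 3 neurons and a Heaviside step activation, then checks it on the four XOR inputs; the specific weights and the step activation are illustrative choices.

import numpy as np

step = lambda z: (z > 0).astype(float)  # Heaviside step activation (one possible choice)

# Hidden layer, 3 neurons: h1 ~ OR(x1, x2), h2 ~ AND(x1, x2), h3 unused.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0],
               [0.0, 0.0]])
b1 = np.array([-0.5, -1.5, -0.5])

# Output: XOR = OR AND (NOT AND), i.e. step(h1 - h2 - 0.5); h3 gets weight 0.
w2 = np.array([1.0, -1.0, 0.0])
b2 = -0.5

def net(x):
    h = step(W1 @ x + b1)
    return step(w2 @ h + b2)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", int(net(np.array(x, dtype=float))))  # expected 0, 1, 1, 0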
4. Convolution
Consider the following convolution of the image X and kernel K:
(1) Calculate the convolution of X and K.
(2) Calculate the cross-correlation of X and K.
(3) Wide convolution: for an image X ∈ R^{M×N} and a kernel K ∈ R^{m×n}, zero padding is applied to both dimensions of X, padding each end with m − 1 and n − 1 zeros respectively, resulting in the fully padded image X̃. The convolution of X̃ and K is called the wide convolution. Calculate the wide convolution of X and K.
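The Python sketch below shows how the three quantities relate, using scipy.signal; the matrices X and K here are made-up stand-ins, since the actual image and kernel from the assignment are not reproduced in this preview.

import numpy as np
from scipy.signal import convolve2d, correlate2d

# Placeholder X and K (the real ones are given in the assignment).
X = np.array([[1, 2, 3, 0],
              [0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 3, 0, 1]], dtype=float)
K = np.array([[1, 0],
              [0, -1]], dtype=float)

# (1) Convolution (kernel flipped), no padding.
print(convolve2d(X, K, mode="valid"))

# (2) Cross-correlation (kernel not flipped), no padding.
print(correlate2d(X, K, mode="valid"))

# (3) Wide convolution: zero-pad each end by m-1 and n-1, i.e. "full" mode.
print(convolve2d(X, K, mode="full"))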
5. Universal Approximation Theorem
Cybenko [1989] and Hornik et al. [1989] showed the following theorem.
Assume ϕ(·) is a bounded, non-decreasing function. For any continuous function
f : [0, 1]^d → R and any ϵ > 0, there exist M ∈ N, v_i, b_i ∈ R, and w_i ∈ R^d, such that
| f(x) − Σ_{i=1}^M v_i ϕ(w_i^⊤ x + b_i) | ≤ ϵ,        (1)
i.e., any continuous function defined on [0, 1]^d can be approximated by the constructed
function class.
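As a purely numerical illustration of the theorem (d = 1, not a proof), the Python sketch below fits only the output weights v_i of a one-hidden-layer sigmoid network to f(x) = sin(2πx) on [0, 1] by least squares; the target function, the number of hidden units M, and the random inner weights are all made-up choices.

import numpy as np

rng = np.random.default_rng(0)
phi = lambda z: 1.0 / (1.0 + np.exp(-z))    # bounded, non-decreasing

M = 50                                      # number of hidden units
w = rng.normal(scale=10.0, size=M)          # random inner weights w_i (kept fixed)
b = rng.uniform(-10.0, 10.0, size=M)        # random biases b_i

x = np.linspace(0.0, 1.0, 400)
f = np.sin(2 * np.pi * x)                   # continuous target on [0, 1]

Phi = phi(np.outer(x, w) + b)               # hidden activations phi(w_i x + b_i)
v, *_ = np.linalg.lstsq(Phi, f, rcond=None) # fit output weights v_i only
print("max abs error:", np.max(np.abs(Phi @ v - f)))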
(1) Verify that any continuous function defined on [0, 1]^d can be approximated by a neural network with one hidden layer and the sigmoid activation function.
(2) Show that any continuous function defined on [0, 1]^d can also be approximated by a neural network with one hidden layer and the ReLU activation function. (Hint: apply the theorem above to some ‘approximation’ of ReLU.)
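One common way to read the hint in (2): the ramp function ϕ(z) = min(max(z, 0), 1) is bounded and non-decreasing, so the theorem applies to it, and it equals ReLU(z) − ReLU(z − 1) exactly, so each ramp hidden unit expands into two ReLU hidden units. The Python sketch below only checks that identity numerically; the ramp is an illustrative choice, not necessarily the intended one.

import numpy as np

relu = lambda z: np.maximum(z, 0.0)
ramp = lambda z: np.clip(z, 0.0, 1.0)  # bounded, non-decreasing "hard sigmoid"

# The ramp is exactly a difference of two shifted ReLUs, so a network with
# ramp hidden units can be rewritten as a (twice as wide) ReLU network.
z = np.linspace(-3.0, 3.0, 13)
print(np.allclose(ramp(z), relu(z) - relu(z - 1.0)))  # True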
