
Details

Author(s) / Contributors
Title
Training Compact DNNs with ℓ1/2 Regularization
Is part of
  • Pattern recognition, 2023-04, Vol.136, Article 109206
Place / Publisher
Elsevier Ltd
Year of publication
2023
Link to full text
Source
Alma/SFX Local Collection
Descriptions / Notes
  • Highlights:
    • We propose a network compression model based on ℓ1/2 regularization. To the best of our knowledge, it is the first work utilizing non-Lipschitz continuous regularization to compress DNNs.
    • We rigorously prove the correspondence between ℓp (0<p<1) regularization and the hyper-Laplacian prior. Based on this prior, we suggest using ℓ1/2 as the single regularizer to sparsify the connections and neurons of the network simultaneously.
    • We give a closed-form, threshold solution to the proximal operator of ℓ1/2 and consequently design a stochastic proximal gradient algorithm to train the resulting model (a sketch of such a step is given after this list).
    • We conduct experiments to validate the performance of the proposed method. The results demonstrate that our method outperforms benchmark methods in terms of accuracy, computation and memory costs.
  • Abstract: Deep neural networks (DNNs) have achieved unprecedented success in many fields. However, their large numbers of parameters, which place a great burden on storage and computation, hinder the development and application of DNNs. It is therefore worthwhile to compress the model to reduce its complexity. Sparsity-inducing regularizers are among the most common tools for compression. In this paper, we propose using the ℓ1/2 quasi-norm to zero out weights of neural networks and thereby compress the networks automatically during the learning process. To our knowledge, it is the first work applying a non-Lipschitz continuous regularizer to the compression of DNNs. The resulting sparse optimization problem is solved by a stochastic proximal gradient algorithm. For further convenience of calculation, an approximation of the threshold-form solution to the proximal operator of ℓ1/2 is given at the same time. Extensive experiments with various datasets and baselines demonstrate the advantages of our new method.
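  • Sketch: the "closed-form, threshold solution to the proximal operator of ℓ1/2" in the highlights corresponds to a half-thresholding rule of the kind introduced by Xu et al. (2012). The minimal PyTorch sketch below assumes that standard formula rather than the paper's own approximation, and the names l_half_prox, prox_sgd_step, lam and lr are illustrative, not taken from the article.

    import math
    import torch

    def l_half_prox(w: torch.Tensor, lam: float) -> torch.Tensor:
        """Elementwise proximal operator of lam * |w|^(1/2).

        Standard half-thresholding rule (Xu et al., 2012), used here as a
        generic stand-in for the paper's threshold-form solution: entries
        with |w| <= (3/2) * lam^(2/3) are set exactly to zero, the rest
        are shrunk by a cosine formula.
        """
        thresh = 1.5 * lam ** (2.0 / 3.0)
        absw = w.abs().clamp_min(1e-12)  # avoid division by zero below
        arg = ((lam / 4.0) * (absw / 3.0) ** (-1.5)).clamp(max=1.0)
        phi = torch.acos(arg)
        shrunk = (2.0 / 3.0) * w * (1.0 + torch.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))
        return torch.where(w.abs() > thresh, shrunk, torch.zeros_like(w))

    def prox_sgd_step(model: torch.nn.Module, loss: torch.Tensor, lr: float, lam: float) -> None:
        """One stochastic proximal gradient step (hypothetical helper):
        SGD step on the smooth data loss, then the l1/2 prox on every
        weight tensor; biases are left unregularized in this sketch."""
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is None:
                    continue
                p -= lr * p.grad                      # gradient step on the data loss
                if p.dim() > 1:                       # weight matrices / conv filters only
                    p.copy_(l_half_prox(p, lr * lam)) # prox of the step-size-scaled regularizer

    In a scheme of this kind, the prox is applied with parameter lr * lam after each mini-batch gradient step; weights driven exactly to zero can afterwards be pruned, which is how an ℓ1/2-type regularizer yields a compact network.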
Language
English
Identifiers
ISSN: 0031-3203
eISSN: 1873-5142
DOI: 10.1016/j.patcog.2022.109206
Title ID: cdi_elsevier_sciencedirect_doi_10_1016_j_patcog_2022_109206
