Details

Author(s) / Contributors
Title
Deep Signal Compression with Feature Representation Learning
Place / Publisher
ProQuest Dissertations & Theses
Year of publication
2024
Source
ProQuest Dissertations & Theses A&I
Descriptions / Notes
  • Deep learning-based lossy signal compression methods have made substantial progress in recent years and have significantly enriched signal compression methodology. Signal source coding can benefit from learned methods in two major ways. First, the data-intensive nature of deep signal compression methods lets them capture the probability distribution of feature representations well, which, with proper modeling, leads to efficient entropy coding. Second, neural network architectures provide powerful solutions for feature extraction and representation learning, thereby eliminating spatial and temporal redundancies by mapping the raw signal to more compact feature domains. This thesis presents four related works that address the compression of different data formats: speech audio, image, video, and point cloud. The first work introduces a unified compression method that uses generative adversarial networks (GANs) to compress speech audio and images. The compressed signal is represented by a latent vector fed into a generator network, which is trained to produce high-quality signals that minimize a target objective function. Non-uniform quantization based on the alternating direction method of multipliers (ADMM) is incorporated to discretize the resulting latent vectors effectively. The second work presents a deep video coding framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient feature-domain representation of each video frame and then performs inter-frame prediction in that lower-dimensional space. To exploit the temporal correlation among the feature-space frames, it employs a network based on convolutional long short-term memory (ConvLSTM) to predict the representation of the future frame. The transmitted bitstream is obtained by quantizing and entropy encoding the feature-space residual.
The proposed video prediction scheme is also studied in an anomaly detection task. The third work addresses the motion pattern adaptability issue that is widespread in video codecs with a block-wise mode-ensemble deep video compression framework. It selects the optimal mode for feature-domain prediction, adapting to different motion patterns. The proposed modes include ConvLSTM-based feature-domain prediction, optical-flow-conditioned feature-domain prediction, and feature propagation, which together cover a wide range of cases, from static scenes without apparent motion to dynamic scenes with a moving camera. Guided by a binary density map, dense and sparse post-quantization residual blocks are coded with separate entropy coding schemes. In addition, applying optional run-length coding to sparse residuals can further improve the compression rate. The last work explores methods to compress light detection and ranging (LiDAR) data, which extends the study of deep signal compression from the 2D to the 3D domain. LiDAR sensors are widely adopted in the autonomous navigation, virtual reality (VR), and augmented reality (AR) industries, where communication bandwidth is a top concern yet remains understudied. Because point clouds and range images are two interchangeable LiDAR data representations, a hybrid framework is introduced to combine the strengths of both. The proposed pipeline relies mainly on a prediction-based approach to exploit spatial and temporal correlations in range images, while providing an octree-based path as an important fallback in certain cases to preserve the reconstruction quality. A content-adaptive point cloud sampling technique is also introduced to obtain additional compression gains and is shown to have minimal impact on machine perception tasks.
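The abstract mentions quantizing a feature-space residual and optionally applying run-length coding to sparse residuals. The following is only a minimal illustrative sketch of that general idea, not the thesis's implementation: the function names, the sample values, and the use of a uniform scalar quantizer with step 0.5 are all assumptions made for this example, since the record does not specify the quantizer or the coding format.

```python
from typing import List, Tuple


def quantize(residual: List[float], step: float) -> List[int]:
    """Uniform scalar quantization (an assumption for this sketch):
    map each value to the index of the nearest multiple of `step`."""
    return [round(r / step) for r in residual]


def run_length_encode(symbols: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeated symbols into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            # Same symbol as the previous one: extend the current run.
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            # New symbol: start a new run of length 1.
            runs.append((s, 1))
    return runs


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Invert run_length_encode by expanding each run."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical sparse residual: most entries are close to zero.
residual = [0.02, -0.01, 0.0, 0.9, 0.03, 0.0, 0.0, -0.04, -1.6, 0.01]
quantized = quantize(residual, step=0.5)
runs = run_length_encode(quantized)
decoded = run_length_decode(runs)
```

In this toy case the ten quantized symbols collapse into five runs, because the near-zero entries quantize to long stretches of zeros. Run-length coding itself is lossless, so `decoded` equals `quantized`; the only loss comes from the quantization step.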
Language
English
Identifiers
ISBN: 9798382739212
Title ID: cdi_proquest_journals_3064616509
