Sitemap

A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.

Pages

Posts

My first Deity victory in Civilization V

less than 1 minute read

Published:

A record of the moment I beat Deity difficulty for the first time.

PointNeRF parameter configuration

less than 1 minute read

Published:

PointNeRF has far too many parameters, so I made a table to help keep track of them.

Installing the environment for PointNeRF

2 minute read

Published:

I set up the PointNeRF environment once before, but this time I had to install it again on my own server and my memory of the steps was fuzzy, so I decided to keep a record of every experiment from now on. Repeatedly reinventing the wheel like this is tedious, so I am writing the steps down; the third time I set up the environment, I can simply look them up.

Prepare the training dataset for PT-CS-Translator

3 minute read

Published:

In this blog post, I will describe how I prepared a website for collecting audio datasets of Chaoshanese.

By the way, if you can speak ChaoshanHua, please help us by translating the training dataset.

How to translate ChaoShanHua to PuTongHua?

3 minute read

Published:

One of my goals in learning computer science is to implement a translator that can translate PutongHua to ChaoshanHua. In my freshman-year C++ course project, I implemented a prototype that converted PutongHua to characters via the Xunfei API and then mapped the characters to ChaoshanHua using recorded audio of each word (Lol, how naïve it was). It was far from satisfactory because the program did not consider the semantics of the language! Due to its complexity, I postponed this project and continued my studies in computer science. Thanks to the deep learning I have studied under the supervision of my advisor, Professor Si Wu, I am now reconsidering this project with serious commitment!

portfolio

publications

Improving Representation Learning in Autoencoders via Multidimensional Interpolation and Dual Regularizations

Published in IJCAI, 2019

Autoencoders enjoy a remarkable ability to learn data representations. Research on autoencoders shows that the effectiveness of data interpolation can reflect the performance of representation learning. However, existing interpolation methods in autoencoders do not have enough capability of traversing a possible region between datapoints on a data manifold, and the distribution of interpolated latent representations is not considered. To address these issues, we aim to fully exert the potential of data interpolation and further improve representation learning in autoencoders. Specifically, we propose a multidimensional interpolation approach to increase the capability of data interpolation by setting random interpolation coefficients for each dimension of the latent representations. In addition, we regularize autoencoders in both the latent and data spaces, by imposing a prior on the latent representations in the Maximum Mean Discrepancy (MMD) framework and encouraging generated datapoints to be realistic in the Generative Adversarial Network (GAN) framework. Empirically, compared to representative models, our approach yields representations that perform better on downstream tasks across multiple benchmarks.
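The multidimensional interpolation described above can be sketched roughly as follows; the function name and shapes are illustrative, not the paper's actual code. Instead of one scalar mixing coefficient, each latent dimension receives its own random coefficient:

```python
import numpy as np

def multidimensional_interpolation(z1, z2, rng=None):
    """Interpolate two latent codes with an independent random
    coefficient per latent dimension, rather than a single alpha."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.uniform(0.0, 1.0, size=z1.shape)  # one coefficient per dimension
    return alpha * z1 + (1.0 - alpha) * z2

z1, z2 = np.zeros(8), np.ones(8)
z = multidimensional_interpolation(z1, z2)
# every coordinate lies between the corresponding coordinates of z1 and z2
assert np.all((z >= 0.0) & (z <= 1.0))
```

Because the coefficients vary per dimension, the interpolated codes cover an axis-aligned box between the two endpoints instead of only the line segment, which is what lets them traverse a wider region of the manifold.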

High Fidelity GAN Inversion via Prior Multi-Subspace Feature Composition

Published in AAAI, 2021

Check this paper out if you are interested in the following questions:

  • How to analyze a pre-trained generator's knowledge?
  • How to find a target image in a pre-trained generator?
  • How to manipulate your images?

Generative Adversarial Networks (GANs) have shown impressive gains in image synthesis. GAN inversion was recently studied to understand and utilize the knowledge a GAN learns, where a real image is inverted back to a latent code and can thus be reconstructed by the generator. Although increasing the number of latent codes can improve inversion quality to a certain extent, we find that important details may still be neglected when performing feature composition over all the intermediate feature channels. To address this issue, we propose a Prior multi-Subspace Feature Composition (PmSFC) approach for high-fidelity inversion. Considering that the intermediate features are highly correlated with each other, we incorporate a self-expressive layer in the generator to discover meaningful subspaces. In this case, the features at a channel can be expressed as a linear combination of those at other channels in the same subspace. We perform feature composition separately in the subspaces. The semantic differences between them benefit the inversion quality, since the inversion process is regularized based on different aspects of semantics. In the experiments, the superior performance of PmSFC demonstrates the effectiveness of prior subspaces in facilitating GAN inversion together with extended applications in visual manipulation.
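The self-expressive idea above can be illustrated with a toy example: each channel's features are reconstructed as a linear combination of the other channels, via a coefficient matrix with zero diagonal (as in subspace clustering). All names and shapes here are made up for illustration, not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 16))   # 4 channels x 16 (flattened) spatial features
C = rng.normal(size=(4, 4)) * 0.1     # self-expression coefficients
np.fill_diagonal(C, 0.0)              # a channel may not reconstruct itself

# each row of `reconstructed` mixes only the *other* channels' features
reconstructed = C @ features
assert reconstructed.shape == features.shape
```

In the trained model, channels whose coefficients strongly express one another would be grouped into the same subspace, and feature composition is then performed per subspace rather than over all channels at once.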

Adversarial Adaptive Interpolation for Regularizing Representation Learning and Image Synthesis in Autoencoders

Published in ICME, 2021

A new interpolation method that follows the data manifold.

Data interpolation is typically used to explore and understand the latent representation learnt by a deep network. Naive linear interpolation may induce mismatch between the interpolated data and the underlying manifold of the original data. In this paper, we propose an Adversarial Adaptive Interpolation (AdvAI) approach for facilitating representation learning and image synthesis in autoencoders. To determine an interpolation path that stays on the manifold, we incorporate an interpolation correction module, which learns to offset the deviation from the manifold. Further, we perform matching with a prior distribution to control the characteristics of the representation. The data synthesized from random codes along with interpolation-based regularization are in turn used to constrain the representation learning process. In the experiments, the superior performance of the proposed approach demonstrates the effectiveness of AdvAI and associated regularizers in a variety of downstream tasks.
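A rough sketch of the idea: start from naive linear interpolation, then add a correction that pulls the interpolated code back toward the manifold. Here the "manifold" is pretended to be the unit sphere, and `correction` is a hypothetical stand-in for the paper's learned interpolation-correction module:

```python
import numpy as np

def adaptive_interpolation(z1, z2, alpha, correction):
    """Linear interpolation plus an offset that compensates for the
    interpolated code's deviation from the data manifold."""
    z_lin = alpha * z1 + (1.0 - alpha) * z2
    return z_lin + correction(z_lin)

# toy correction: the offset that projects a code onto the unit sphere,
# pretending the manifold of valid codes is the sphere's surface
unit_sphere = lambda z: z / np.linalg.norm(z) - z

z = adaptive_interpolation(np.array([2.0, 0.0]), np.array([0.0, 2.0]), 0.5, unit_sphere)
assert np.isclose(np.linalg.norm(z), 1.0)  # midpoint corrected back onto the sphere
```

The linear midpoint of two points on a sphere falls inside it; the correction term restores the code to the surface, which is exactly the mismatch the learned module is trained to offset.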

Discovering Density-Preserving Latent Space Walks in GANs for Semantic Image Transformations

Published in ACM Multimedia, 2021

A new traversal method that preserves the latent probability density and excavates useful directions from the pretrained generator.

Generative adversarial network (GAN)-based models possess superior capability of high-fidelity image synthesis. There are a wide range of semantically meaningful directions in the latent representation space of well-trained GANs, and the corresponding latent space walks are meaningful for semantic controllability in the synthesized images. To explore the underlying organization of a latent space, we propose an unsupervised Density-Preserving Latent Semantics Exploration model (DP-LaSE). The important latent directions are determined by maximizing the variations in intermediate features, while the correlation between the directions is minimized. Considering that latent codes are sampled from a prior distribution, we adopt a density-preserving regularization approach to ensure latent space walks are maintained in iso-density regions, since moving to a higher/lower density region tends to cause unexpected transformations. To further refine semantics-specific transformations, we perform subspace learning over intermediate feature channels, such that the transformations are limited to the most relevant subspaces. Extensive experiments on a variety of benchmark datasets demonstrate that DP-LaSE is able to discover interpretable latent space walks, and specific properties of synthesized images can thus be precisely controlled.
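For an isotropic Gaussian prior, density depends only on the norm of the latent code, so one simple way to keep a walk inside an iso-density region is to renormalize after every step. This toy sketch illustrates that constraint only; it is not the paper's actual regularizer:

```python
import numpy as np

def density_preserving_step(z, direction, step=0.1):
    """Step along `direction`, then rescale so the code stays on the
    same iso-density shell of a standard Gaussian prior (for an
    isotropic Gaussian, density is a function of ||z|| alone)."""
    radius = np.linalg.norm(z)
    z_new = z + step * direction
    return z_new * (radius / np.linalg.norm(z_new))

z = np.array([3.0, 0.0])
z_walked = density_preserving_step(z, np.array([0.0, 1.0]))
assert np.isclose(np.linalg.norm(z_walked), np.linalg.norm(z))
```

Without such a constraint, repeated steps drift toward higher- or lower-density regions of the prior, where the generator was rarely trained and transformations become unpredictable.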

Adversarial Adaptive Interpolation in Autoencoders for Dually Regularizing Representation Learning

Published in IEEE MultiMedia, 2022

A new interpolation method that follows the data manifold.

Data interpolation is typically used to explore and understand the latent representation learnt by a deep network. Naive linear interpolation may induce mismatch between the interpolated data and the underlying manifold of the original data. In this paper, we propose an Adversarial Adaptive Interpolation (AdvAI) approach for facilitating representation learning and image synthesis in autoencoders. To determine an interpolation path that stays on the manifold, we incorporate an interpolation correction module, which learns to offset the deviation from the manifold. Further, we perform matching with a prior distribution to control the characteristics of the representation. The data synthesized from random codes along with interpolation-based regularization are in turn used to constrain the representation learning process. In the experiments, the superior performance of the proposed approach demonstrates the effectiveness of AdvAI and associated regularizers in a variety of downstream tasks.

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.