520 | |
▼a The past few decades have seen a rapid expansion of our digital world. While early dwellers of the Internet exchanged simple text messages via email, modern citizens of the digital world conduct a much richer set of activities online, such as entertainment, banking, and booking restaurants and hotels. In our digitally enriched lives, we not only enjoy great convenience and efficiency, but also leave behind massive amounts of data that offer ample opportunities for improving these digital services and creating new ones. Meanwhile, technical advancements have facilitated the emergence of new sensors and networks that can measure, exchange, and log data about real-world events. These technologies have been applied to many different scenarios, including environmental monitoring, advanced manufacturing, healthcare, and scientific research in physics, chemistry, biotechnology, and social science. Leveraging the abundant data, learning-based and data-driven methods have become a dominant paradigm across different areas, with data analytics driving many of the recent developments. However, the massive amount of data also brings considerable challenges for analytics. Among them, the collected data are often high-dimensional, with the true knowledge and signal of interest hidden underneath. It is therefore of great importance to reduce the data dimension and transform the data into the right space. In some cases, the data are generated from certain generative models that are identifiable, making it possible to reduce the data back to the original space. In addition, we are often interested in performing some analysis on the data after dimensionality reduction (DR), and it is helpful to be mindful of these subsequent analysis steps when performing DR, as latent structures can serve as a valuable prior. Based on this reasoning, we develop two methods, one for the linear generative model case and the other for the nonlinear case. 
In a related setting, we study parameter estimation under unknown nonlinear distortion. In this case, the unknown nonlinearity in the measurements poses a severe challenge. In practice, various mechanisms can introduce nonlinearity into the measured data. To combat this challenge, we put forth a nonlinear mixture model that is well grounded in real-world applications. We show that this model is in fact identifiable up to some trivial indeterminacy. We develop an efficient algorithm to recover the latent parameters of this model, and confirm the effectiveness of our theory and algorithm via numerical experiments. |