The system is built on ELK (Elasticsearch, Logstash, Kibana), a big data analytics platform that makes it easy to collect, parse, and search data. The data comes either from network device logs or from metadata computed from software-defined networking (SDN) or network functions virtualization (NFV). The automatic network defense system then applies techniques such as machine learning and deep learning to extract knowledge from the collected network data.

Many IoT applications involving machine-to-machine (M2M) communications are characterized by a large amount of data to transport and limited communication resources to transport it with. To address the communication problem with "big data" in these IoT applications, we propose "data-centric" M2M communications, which shift the communication paradigm from individual machines to the data itself. In other words, instead of trying to serve individual machines with better quality, data-centric communications focus on solutions that better serve the data. Through instantiations in different layers of the network protocol stack, such as resource allocation, transmission scheduling, and network clustering, we have shown that "data-centric" communications can achieve significant performance gains over conventional "machine-centric" communications, both in theoretical formulation and in practical application.

In the era of big data, data analysts extract useful information from large-scale data sets to complete data analysis tasks. Typically, a data set can be accessed only through a curator that answers limited, privacy-preserving queries. Designing efficient privacy-preserving query methods jointly with the data analysis algorithms is a critical engineering challenge. To address it, we investigate the fundamental limits of this setting and develop efficient algorithms that approach those limits.
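To make the curator setting concrete, here is a minimal sketch of one standard way a curator can answer counting queries privately: the Laplace mechanism from differential privacy, combined with a fixed privacy budget that limits how many queries can be asked. The text above does not name a specific mechanism, so the choice of differential privacy, the `Curator` class, its method names, and the sample data are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


class Curator:
    """Hypothetical curator: holds the raw records and answers noisy counts."""

    def __init__(self, records, total_epsilon, seed=None):
        self._records = list(records)
        self.remaining = total_epsilon  # privacy budget left to spend
        self._rng = random.Random(seed)

    def noisy_count(self, predicate, epsilon):
        """Answer "how many records satisfy predicate?" with Laplace noise."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon
        true_count = sum(1 for record in self._records if predicate(record))
        # A counting query changes by at most 1 when one record changes,
        # so its sensitivity is 1 and the noise scale is 1 / epsilon.
        return true_count + laplace_noise(1.0 / epsilon, self._rng)


# Made-up records, for illustration only.
ages = [23, 35, 41, 29, 62, 57, 33]
curator = Curator(ages, total_epsilon=1.0, seed=0)
answer = curator.noisy_count(lambda age: age >= 40, epsilon=0.5)
print(f"noisy count of ages >= 40: {answer:.2f}")
print(f"budget remaining: {curator.remaining}")
```

A smaller `epsilon` per query gives stronger privacy but noisier answers. This accuracy-versus-privacy trade-off is why the querying method and the analysis algorithm are worth designing jointly.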

Our research on social networks focuses on characterizing the fundamental limits of, and developing efficient algorithms for, inferring hidden attributes in social networks. We consider two kinds of methods. The first makes passive observations of the social network, harnessing user interactions such as friendships and subscriptions. The second carries out active sensing of the users, for example through anonymous polls.

The long-term goal of Cognitive NeuroRobotics is to develop a dialog robot based on cognitive neuroscience and on bio-inspired, physiologically plausible algorithms. The project integrates past neuroscience studies of the auditory system and the spatial attention system. It uses Nengo 2.0 (Neural ENGineering Objects) to simulate the auditory processing of Chinese speech signals through the outer, middle, and inner ear, and along the auditory pathway (the major areas for determining the location of a sound source). In this process the input signal is also segmented into words and phrases for subsequent dialog analysis and response generation.

The current study differs from traditional Automatic Speech Recognition (ASR) and Computational Auditory Scene Analysis (CASA) in that our methods are constrained by cognitive neuroscience and the Neural Engineering Framework (NEF). Our models can also produce a multidimensional vector in the Semantic Pointer Architecture (SPA) for use in quasi-symbolic linguistic processes. In addition, they can generate spikes that can be compared with those measured in neurobiological experiments.
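As a rough illustration of how a multidimensional vector can support quasi-symbolic processing, the sketch below binds two random vectors with circular convolution and then approximately recovers one from the other. Circular convolution is the binding operation commonly used with semantic pointers. Whether the models described here use exactly this operation is an assumption, as are the dimension, the random vectors, and all function names. The sketch uses plain Python rather than Nengo and involves no spiking neurons.

```python
import math
import random


def random_pointer(dim, rng):
    """Random vector with expected unit length, standing in for a semantic pointer."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(dim)) for _ in range(dim)]


def bind(a, b):
    """Circular convolution: c[k] = sum_i a[i] * b[(k - i) mod n]."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]


def approx_inverse(a):
    """Involution a'[i] = a[(-i) mod n], used as an approximate inverse for unbinding."""
    n = len(a)
    return [a[(-i) % n] for i in range(n)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
dim = 256  # illustrative dimension, not taken from the text
role, filler, unrelated = (random_pointer(dim, rng) for _ in range(3))

bound = bind(role, filler)                      # one vector encoding the pair
recovered = bind(approx_inverse(role), bound)   # noisy estimate of `filler`

sim_filler = cosine(recovered, filler)
sim_unrelated = cosine(recovered, unrelated)
print(f"similarity to filler:    {sim_filler:.2f}")
print(f"similarity to unrelated: {sim_unrelated:.2f}")
```

The recovered vector is only a noisy copy of `filler`, but it should be much closer to `filler` than to an unrelated vector. That gap is what lets a structure such as a role-filler pair be stored in, and queried from, a single fixed-length vector.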

Two research topics in Cognitive NeuroRobotics are now being initiated. The first is to let the robot construct knowledge about its spatial environment through dialogs with humans or other robots. The second is to speed up execution of the Nengo system via an emulator of the SpiNNaker IC chip. We hope that the robots will be able to compute dialog responses in real time by running programs on SpiNNaker.

With the popularity of shared videos, social networks, online courses, and similar services, the quantity of multimedia and spoken content is growing far faster than human beings can view or listen to it. Using deep learning techniques, we are working on making machines understand spoken content and extract key information for humans. Below, we highlight three achievements from 2016. First, to test the listening comprehension of a machine, we had it answer questions from the TOEFL listening comprehension test, and it answered about half of them correctly. Second, we used deep reinforcement learning to determine the machine's actions in interactive spoken content retrieval. Finally, we showed that a machine can learn human language from audio stories without any supervision.