Latest News

A Computational ODE Model for the Evaluation of Immune System Pathway Dynamics in Homeostasis and Disease

The complement system is a part of innate immunity that rapidly removes invading pathogens and impaired host cells. Activation of the complement system is balanced under homeostasis by regulators that protect healthy host cells. Impairment of complement regulators tilts the balance, favoring activation and propagation that leads to inflammatory and autoimmune diseases. To understand the dynamics of...

Asymmetric AdaBoost for High Dimensional Maximum Score Regression

Adaptive Boosting, or AdaBoost, introduced by Freund and Schapire (1996), has proved effective for high-dimensional binary classification and binary prediction problems. Friedman, Hastie, and Tibshirani (2000) show that AdaBoost builds an additive logistic regression model by minimizing the ‘exponential loss’. We show that the exponential loss in AdaBoost is equivalent...

Power Attacks in Multi-Tenant Data Centers: Threat and Defense

The explosion of Internet of Things and cloud computing applications has generated a huge demand for multi-tenant collocation data centers everywhere, extending the Internet edge beyond the traditional hub locations. As one would expect, securing data centers against cyber attacks is extremely important, and so is providing a reliable power supply to servers. While the threat...

Continuous Visual Learning with Limited Supervision by Exploiting Context

It is well known that relationships between data points (i.e., context) in structured data can be exploited to obtain better recognition performance. In our recent work, we have explored a different, but related, problem: how can these interrelationships be used to efficiently learn and continuously update a recognition model, with minimal human labeling effort...

A Bird's-Eye View on Microblogs Data Management and Analysis

Microblogs data, e.g., tweets, reviews, news comments, and social media comments, has gained considerable attention in recent years due to its popularity and rich content. Nowadays, microblogs applications span a wide spectrum of interests, including analyzing events and user activities and critical applications like discovering health issues and rescue services. Consequently, major research efforts are...

Towards Improved Hydrologic Prediction by Merging Data with Models

Increases in greenhouse gas concentrations are expected to impact the terrestrial hydrologic cycle through changes in radiative forcings and plant physiological and structural responses. As a result, projections of future changes in water resources become complicated due to the tight coupling between the biosphere and terrestrial hydrologic cycle. In recent years a number of physically...

Environmental Sensing Data for Assessing the Role of Vegetation in Urban Water and Climate Sustainability

Environmental sensing has expanded rapidly for more than a decade. I will provide an overview of the dimensions of this data revolution within the ecological sciences. I will then describe a specific evaluation of the water-ecosystem service trade-offs for the use of urban vegetation to cool cities. Vegetation interacts strongly with urban water sustainability. Plants...

Placing Lobbyists in Legislative Ideological Space

I propose a new method to place lobbyists into standard common space measures for ideology scores, leveraging responses from former members of the U.S. Congress to a survey containing a battery of ideology attitude measures, along with a flexible Bayesian statistical model. The statistical model incorporates estimation uncertainty into the imputed lobbyist ideology measures and...

Measuring Trade Profiles with Two Billion Observations of Product Trade

The product composition of bilateral trade encapsulates complex relationships about comparative advantage, global production networks, and domestic politics. Yet, despite the availability of product-level trade data, most researchers rely on either the total volume of trade or certain sets of aggregated products. We develop a new dynamic clustering method to effectively summarize this massive amount...

Computational analysis of olfaction

Insects use the sense of smell to identify their host animals and plants. The ability to detect and discriminate thousands of odorants from their hosts uses a very large family of transmembrane odorant receptors and complex neuronal circuitry. The study of olfaction has benefited from computational approaches to identify important principles: protein sequences of receptors...

Time Series Data Mining Using the Matrix Profile: A Unified View of Motif Discovery, Anomaly Detection, Segmentation, Classification, Clustering and Similarity Joins

Time series data mining is a perennially popular research topic, due to the ubiquity of time series in medical, financial, industrial, and scientific domains. There are about a dozen major time series data mining tasks, including: • Time Series Motif Discovery • Time Series Joins • Time Series Classification (shapelet discovery) • Time Series Density...

Computational Imaging: Finding Structure from Randomness

We face two broad challenges as we design the next generation of intelligent and interconnected devices: On one extreme, these systems will collect an enormous amount of data from a multitude of sources and require low-complexity, versatile algorithms that can make sense of all the data. On the other extreme, certain physical or system constraints...

You Got Data? We Got Tensors!!

Tensors and tensor decompositions have been very popular and effective tools for analyzing multi-aspect data in a wide variety of fields, ranging from Psychology to Chemometrics, and from Signal Processing to Data Mining and Machine Learning. In this talk, first I will motivate the use of tensors as an effective data analytic tool in a...

Big Spatial Data on Hadoop and Beyond

The explosion in the amount of spatial data in recent years has urged researchers to build specialized systems for big spatial data. This talk will have two parts. In the first part, we describe SpatialHadoop, the most comprehensive open-source system for big spatial data. We describe how SpatialHadoop managed to achieve simplicity and efficiency...

How big data and computational models are changing the study of child language acquisition

For decades, cognitive psychologists and linguists have studied language development by testing theories of learning and development in highly controlled behavioral experiments. Much has been learned from this approach. However, Big Data and computational models allow us to investigate language development in a radically different way: by collecting large datasets of actual speech to children...

Big data opportunities in computational materials science

Modern electronic structure methods allow reasonably accurate computation of the quantum mechanical behavior of atoms and electrons in a material with almost any chemical composition, using virtually no input parameters. These methods allow the design of new materials even before they are made in the laboratory. The only input parameters for these methods are...

Multiple dimensions of epigenetic gene regulation in malaria parasites: Transcriptional regulation via histone modifications, nucleosome positioning and nuclear architecture

My lab investigates new high-throughput functional genomics methods to study gene regulation in the major human malaria parasite species (P. falciparum and P. vivax) and the zoonotic species, P. knowlesi. In particular, my laboratory generates large genome-wide data sets using next-generation sequencing and proteomics technologies along with novel computational biology approaches to better...

Microbial eukaryotes in marine sediments: linking molecules with morphology in the -Omics age

Microbial eukaryotes (organisms <1 mm, such as nematodes, fungi, protists, and other ‘minor’ metazoan phyla) are abundant and ubiquitous in marine sediments, performing key functions such as nutrient cycling and sediment stability in marine habitats. Yet, their unexplored diversity represents one of the major challenges in biology and currently limits our capacity to understand, mitigate and...