Email: deigen@cs.nyu.edu
In this project, we explore two ways of customizing nearest-neighbor results for individual queries in the context of a kNN image classifier. First, we learn per-descriptor weights that minimize classification error, using backpropagation through the nearest-neighbor lookups and calculations. Second, we adapt the training set used for each query based on image context; in particular, we condition on common classes (which are relatively easy to classify) to improve performance on rare ones. The first technique helps to remove extraneous descriptors that result from the imperfect distance metrics and representations of the data. The second technique rebalances the class frequencies, away from the highly skewed distribution found in real-world scenes.
Created a system that classifies IP addresses as likely spam or ham email senders based on their recent trap rates, taking live streams of spam-trap hits and overall mail-volume estimates as input.
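As a rough illustration of the trap-rate idea above, the following minimal sketch divides each IP's recent spam-trap hits by its estimated mail volume and thresholds the result. The function names, smoothing priors, and thresholds are all hypothetical choices made for this example; they are not taken from the original system.

```python
from collections import defaultdict


def trap_rates(trap_hits, volume_estimates, prior_hits=1.0, prior_volume=100.0):
    """Return a smoothed trap rate (trap hits / estimated volume) per IP.

    trap_hits: iterable of IP strings, one entry per spam-trap hit.
    volume_estimates: dict mapping IP -> estimated messages sent recently.
    The priors are assumed values that keep low-volume IPs from getting
    extreme rates; IPs without a volume estimate are ignored.
    """
    hits = defaultdict(int)
    for ip in trap_hits:
        hits[ip] += 1
    return {
        ip: (hits[ip] + prior_hits) / (volume + prior_volume)
        for ip, volume in volume_estimates.items()
    }


def classify(rate, spam_threshold=0.05, ham_threshold=0.015):
    """Label a sender from its trap rate (thresholds are illustrative)."""
    if rate >= spam_threshold:
        return "spam"
    if rate <= ham_threshold:
        return "ham"
    return "unknown"


hits_stream = ["10.0.0.1"] * 50 + ["10.0.0.2"]
volumes = {"10.0.0.1": 400, "10.0.0.2": 9900}
labels = {ip: classify(r) for ip, r in trap_rates(hits_stream, volumes).items()}
```

In a live setting the hit counts and volume estimates would come from a sliding window over the input streams rather than from in-memory lists.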
Created a system to automatically classify web page content into 30 categories, based on Bayesian classification methods. The system has categorized over 10 million sites with an estimated misclassification rate of under 10% at 50% recall.
Developed a system that rates each HTTP request with a score indicating how likely the request is to fetch malicious content. The system combines multiple sources of data keyed on URL portions or IP ranges; each source may contribute positively or negatively. The resulting reputation score is used on web proxy devices to block potentially malicious requests and to exempt traffic that is highly likely to be clean from further scanning.
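The scoring scheme described above can be sketched as a sum of signed contributions from rules that match parts of the URL or the server's IP range. This is only a minimal sketch under stated assumptions: the rule table, the weights, the [-10, 10] score range, and the action thresholds are hypothetical and do not reflect the actual product.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical rules: (kind, pattern, weight). Negative weights lower the score.
RULES = [
    ("domain", "example-malware.test", -6.0),
    ("path", "/downloads/", -1.5),
    ("domain", "trusted.test", +7.0),
    ("ip", "203.0.113.0/24", -3.0),
]


def score_request(url, server_ip, rules=RULES):
    """Sum the weights of all matching rules and clamp to [-10, 10]."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    addr = ipaddress.ip_address(server_ip)
    total = 0.0
    for kind, pattern, weight in rules:
        if kind == "domain":
            # Match the domain itself or any subdomain of it.
            matched = host == pattern or host.endswith("." + pattern)
        elif kind == "path":
            matched = parts.path.startswith(pattern)
        elif kind == "ip":
            matched = addr in ipaddress.ip_network(pattern)
        else:
            matched = False
        if matched:
            total += weight
    return max(-10.0, min(10.0, total))


def action(score, block_below=-6.0, bypass_above=6.0):
    """Map a score to a proxy decision (thresholds are assumed values)."""
    if score <= block_below:
        return "block"
    if score >= bypass_above:
        return "skip_scan"
    return "scan"


bad = score_request("http://cdn.example-malware.test/downloads/a.exe", "203.0.113.9")
good = score_request("https://www.trusted.test/index.html", "198.51.100.7")
```

Requests whose score falls between the two thresholds are passed to content scanning as usual.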
Developed systems to process traffic samples sent back from web proxy appliances. IronPort has several thousand appliances deployed at customer sites throughout the world, forming a sensor network of web-related data. This data is automatically fed back into Web Reputation and other systems. We also use it to evaluate efficacy and measure new techniques.
Extracted static client code and dependencies into a dynamically updateable package, taking into account potential differences in system libraries and hardware architecture between client platform releases. Independently of system upgrades, the engine is automatically updated live on the appliance to the currently provisioned version, without any input or interaction from the administrator.
The NVLog is an intent journal used in WAFL (Write-Anywhere File Layout) to ensure data integrity in the event of a system crash or abrupt shutdown. Since all writes are logged to the journal, this part of the system was a serialization point and a bottleneck. I rewrote the way journal writes are performed to nearly eliminate lock contention, leading to an overall system throughput gain of over 10%.
Outlined a design for dynamic microcores, a reporting and debugging feature. Microcores are partial coredump files, only a few megabytes. The project aimed to let engineers write descriptive recipes to identify memory regions, and trigger microcore generation upon hitting system events. For example, if a system message warns about possible corruption in a block, one could identify interesting memory regions relative to the in-core structures for the block and inode in the message.
Posed and investigated the question of whether different interaction techniques affect a user's memory. I designed a set of two experiments to address this. The first, a preliminary study confirming a well-known difference in performance between position-based and velocity-based controls, helped to verify my experimental methodology. The second, a comparison of three interaction modes in an immersive environment, was statistically inconclusive. Anecdotal evidence, however, suggested that for many subjects, memory performance improved with a full-body walking interaction.
Wrote and reviewed proposals for a 6-week project in a mock grant proposal process during a class on scientific visualization. Carried out my project, which studied electrode parameter settings in deep brain stimulation for obsessive-compulsive disorder, in collaboration with Benjamin Greenberg, a psychiatrist at Butler Hospital. Presented a poster on this project at SIGGRAPH 2004.
Created a software package for building interactive differential geometry visualizations, and produced class labs and demonstrations using this software. Staff and students continue to use this package in new applications and to explore mathematical concepts in several courses at Brown, including differential geometry, combinatorial topology, calculus, geometry, and linear algebra. It has also been used by Prof. Banchoff in classes at UCLA, Notre Dame, University of Georgia, and (in 2010) Stanford.