Past HCI Colloquia
From Particle Stereo to Scene Stereo
Carsten Rother, Microsoft Cambridge
In this talk I will present two lines of research which are both
applied to the problem of stereo matching. The first line of research
tries to make progress on the very traditional problem of stereo
matching. In BMVC 11 we presented the PatchmatchStereo work which achieves surprisingly good results with a simple energy function consisting of unary terms only. As optimization
engine we used the PatchMatch method, which was designed for image
editing purposes. In BMVC 12 we extended this work by adding the
standard pairwise smoothness terms to the energy function. The main
contribution of this work is the optimization technique, which we call
PatchMatch-BeliefPropagation (PMBP). It is a special case of
max-product Particle Belief Propagation, with a new sampling schema
motivated by PatchMatch. The method may be suitable for many energy
minimization problems in computer vision, which have a non-convex,
continuous and potentially high-dimensional label space.
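To make the PatchMatch-style sampling concrete, here is a deliberately minimal 1-D toy with a unary-only matching cost, in the spirit of the approach but not its implementation; the signal, window size and disparity range are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
x = np.linspace(0, 4 * np.pi, n)
left = np.sin(x) + 0.1 * np.sin(5 * x)

def sample(sig, pos):
    return np.interp(pos, np.arange(n), sig)

true_disp = 3.0
right = sample(left, np.arange(n) + true_disp)  # right view: shifted left view

def unary(i, d, win=5):
    # Photometric matching cost (SAD over a small window) for disparity d.
    idx = np.arange(i - win, i + win + 1)
    return np.abs(sample(left, idx + d) - sample(right, idx)).sum()

disp = rng.uniform(0.0, 8.0, n)  # random initialization of the label field
for it in range(5):
    order = range(n) if it % 2 == 0 else range(n - 1, -1, -1)
    for i in order:
        cands = [disp[i]]
        j = i - 1 if it % 2 == 0 else i + 1  # propagate from scan neighbor
        if 0 <= j < n:
            cands.append(disp[j])
        r = 4.0
        while r > 0.1:                       # exponentially shrinking search
            cands.append(disp[i] + rng.uniform(-r, r))
            r /= 2.0
        disp[i] = min(cands, key=lambda c: unary(i, c))

err = np.median(np.abs(disp[20:-20] - true_disp))
```

Propagation spreads good disparities along the scanline while the shrinking random search refines them, which is what makes a unary-only energy surprisingly effective here.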
The second line of research combines the problem of stereo matching
with the problem of object extraction in the scene. We show that both
tasks can be solved jointly, boosting the performance of each individual task.
In particular, stereo matching improves since objects have to obey
physical properties, e.g. they are not allowed to float in the air.
Object extraction improves, as expected, since we have additional
information about depth in the scene.
Machines Reading the Data
Kristian Kersting, Fraunhofer IAIS, Bonn, Germany
The time is ripe for the AI community to set its sights on machines reading
data: the combination of information and feature extraction, learning, and
reasoning to draw conclusions about implicitly given knowledge. This
"understanding of data" --- the formation of a coherent set of beliefs based
on data and a declarative background theory --- is a long-standing goal of AI,
since it holds the promise of revolutionizing web search, robotics,
computational sustainability and other fields. Although much has been
achieved already, much remains to be done if we are to reach this grand
goal. This talk will examine some of what we have recently understood, as a
means of identifying what might be understood next.
Consider, e.g., a subject setting a table, with the height of the subject's
right hand tracked over time. Can machines understand the process
of setting the table? That is, can we make use of declarative world knowledge within continuous, non-linear regression tasks? I will
demonstrate that this is indeed the case if we gate a non-parametric Bayesian regression model for the hand's height with a relational
world model encoding our knowledge about the locations of objects and rules describing actions to set the table. So, reading machines
handle the complexity and uncertainty of the real world using probabilistic relational models. Ideally, probabilistic inference
within such models should be lifted as in first-order logic, handling whole sets of indistinguishable objects together. On
several important AI tasks such as probabilistic inference, SAT, and linear programming, I will illustrate how to lift corresponding
solvers and that significant efficiency gains are obtainable, often by
orders of magnitude. Together, both showcases put within reach a 'Big
Picture' view of AI that should be understood next, namely "Statistical Relational AI".
The talk is mainly based on joint work with Babak Ahmadi, Martin Mladenov,
Sriraam Natarajan, Marion Neumann,
Scott Sanner, and Martin Schiegg.
M. Schiegg, M. Neumann, K. Kersting.
Markov Logic Mixtures of Gaussian Processes: Towards Machines Reading Regression Data.
In Proceedings of the 15th International Conference on Artificial Intelligence
and Statistics (AISTATS 2012), La Palma, Canary Islands, Spain, April 21-23,
2012. Volume 22 of JMLR: W&CP 22.
M. Mladenov, B. Ahmadi, K. Kersting.
Lifted Linear Programming.
In Proceedings of the 15th International Conference on Artificial Intelligence
and Statistics (AISTATS 2012), La Palma, Canary Islands, Spain, April 21-23,
2012. Volume 22 of JMLR: W&CP 22.
K. Kersting, B. Ahmadi, S. Natarajan.
Counting Belief Propagation.
In Proceedings of the 25th Conference on Uncertainty in Artificial
Intelligence (UAI 2009), Montreal, Canada, June 18-21, 2009.
Hardware Acceleration and Image Processing - Architectures and Design Methods
This presentation consists of two parts. In the first part, hardware acceleration based on field programmable gate arrays (FPGAs) is presented. It is shown that architecture-friendly algorithms can outperform CPUs with GHz clock rates by one or two orders of magnitude in the image processing domain. Basic architectural considerations, case studies and design methods for field programmable gate arrays are discussed.
In the second part, a design method for field programmable gate arrays on the physical board level is presented. The method is based on image processing of 3D Computed Tomography (CT) data to determine electrical parameters of high-speed interconnects. X-ray inspection has been used for printed circuit boards (PCB) for many years. As X-ray CT technology has developed further over the years, it has become possible to generate accurate geometric 3D models from CT data, even capturing manufacturing tolerances. It is shown that from these geometric 3D models of passive structures an electrical characterization in the GHz range can be accomplished by applying standard EM field solvers. Compared to electrical measurements, this method has several advantages, such as contactless electrical characterization.
V. Franc (presenter), A. Zien, B. Schölkopf
Support Vector Machines as Probabilistic Models
We show how the SVM can be viewed as a maximum likelihood estimate of
a class of probabilistic models. This model class can be viewed as a
reparametrization of the SVM in a similar vein to the $\nu$-SVM
reparametrizing the classical ($C$-)SVM.
It is not discriminative, but has a non-uniform marginal. We illustrate
the benefits of this new view by re-deriving and re-investigating two
established SVM-related algorithms.
Oliver Zendel, Austrian Institute of Technology
Coverage-oriented test data generation for preparing the certification
Computer vision applications have been increasing steadily over the last
years, but their use in critical situations is still limited due to the
absence of suitable certification procedures. At the Austrian Institute of
Technology (AIT) we are currently working on filling this important gap.
Essential for the certification of algorithms are meaningful test data
sets. Our goal is to generate thorough test data sets automatically
from meta-model descriptions. On the one hand, the test data set should
cover relevant scene elements as well as a broad range of difficult
image effects. On the other hand, it should not introduce rendering
artifacts into the test data that would cause tests to produce
misleading results because of insufficient realism.
My talk will give some insight into our ongoing research activities as
well as an outlook on the future.
Harlyn Baker, HP, Palo Alto
Multi-imager camera arrays for panoramic, multi-viewpoint, and 3D capture
Advances in building high-performance camera arrays have opened the opportunity
– and challenge – of using these devices for synchronized 3D and
multi-viewpoint capture. In this vein, I will discuss a high-bandwidth multi-imager
camera system supporting 72 wide-VGA imagers in simultaneous synchronized 30
and 60 Hz operation uncompressed to a single PC. A 6-imager 1080P60 system is
in house, preparing for upgrade from PCI-X to PCIe, where multiple dozens are
expected to be supported in sustained video delivery. Such a source of massive
synchronized video capture presents new opportunities in imaging, including
geometry recovery, immersive experiences, entertainment, surveillance, autonomous
navigation, and others. A requirement of using camera arrays for quantitative
work is that their relative poses be known, so calibration of these sensing
elements is a prerequisite of their use. I argue for use of structured arrays
– where imagers' intrinsics and relative positions do not change, and
it is feasible to perform this calibration once, before any use. I will present
progress in developing a variety of calibration approaches that capitalize on
high quality homographies (non metric) and related camera placement constraints
in developing globally optimal solutions, including rectifying homographies,
fundamental matrices, epipoles, and epipolar rectification parameters for the
entire system. The methods build on what we identify as the Rank-One-Perturbation-of-Identity
(ROPI) structure of homologies in posing a unified SVD estimator for the parameters.
I will summarize the theory, and present both qualitative depictions and quantitative
assessments of our results.
Our use of these camera systems includes composing panoramic mosaics for videoconferencing and sports capture, linear baseline multiview capture for automultiscopic display, and geometry recovery using Epipolar Plane structurings. I hope to bring a multi-view camera system with me and demonstrate its capabilities on a laptop.
Much of this work has been done in collaboration with Zeyu Li, studying at UC Berkeley under Ruzena Bajcsy.
"End-to-end" machine learning of image segmentation
(for neural circuit reconstruction)
Srini Turaga, Gatsby Computational Neuroscience Unit, UCL
Supervised machine learning is a powerful tool for creating image segmentation algorithms that are well adapted to our datasets. Such algorithms have three basic components: 1) a parametrized function for producing segmentations from images, 2) an objective function that quantifies the performance of a segmentation algorithm relative to ground truth, and 3) a means of searching the parameter space of the segmentation algorithms for an optimum of the objective function.
In this talk, I will present new work in each of these areas: 1) a segmentation algorithm based on convolutional networks as boundary detectors, 2) the Rand index as a measure of segmentation quality, and 3) the MALIS algorithm for training boundary detectors to optimize the Rand index segmentation measure. Taken together, these three pieces constitute the first system for truly "end-to-end" learning of image segmentation, where all parameters in the algorithm are adjusted to directly minimize segmentation error.
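For reference, the Rand index mentioned above can be computed naively as pairwise agreement between two labelings; this is only a sketch of the definition (practical implementations use contingency tables rather than the O(n^2) loop):

```python
import numpy as np
from itertools import combinations

def rand_index(a, b):
    """Fraction of element pairs on which two labelings agree about
    whether the pair belongs to the same segment (naive O(n^2) version)."""
    a, b = np.ravel(a), np.ravel(b)
    n = len(a)
    agree = sum(((a[i] == a[j]) == (b[i] == b[j]))
                for i, j in combinations(range(n), 2))
    return agree / (n * (n - 1) // 2)
```

For example, `rand_index([0, 0, 1, 1], [0, 0, 0, 1])` agrees on 3 of the 6 pixel pairs and therefore returns 0.5; identical labelings score 1.0.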
Multi-People Tracking through Global Optimization
Pascal Fua, EPFL, Lausanne, Switzerland
Given three or four synchronized videos taken at eye level and from different angles, we show that we can effectively detect and track people, even when the only available data comes from the binary output of a simple blob detector and the number of present individuals is a priori unknown.
We start from occupancy probability estimates in a top view and rely on a generative model to yield probability images to be compared with the actual input images. We then refine the estimates so that the probability images match the binary input images as well as possible. Finally, having performed this computation independently at each time step, we compute trajectories over time by solving a convex constrained flow problem, which allows us to accurately follow individuals across thousands of frames. Our algorithm yields metrically accurate trajectories for each one of them, in spite of very significant occlusions.
In short, we combine a mathematically well-founded generative model that works in each frame individually with a simple approach to global optimization. This yields excellent performance using very simple models that could be further improved.
An Iteratively Reweighted Algorithm for Image Reconstruction
Prof. Ming-Jun Lai, Dept. of Mathematics, University of Georgia, U.S.A.
In this talk, I will discuss how to recover a low-rank matrix from a small number of its linear measurements, e.g.,
a subset of its entries. Such a problem shares many common features with the recent study of recovering sparse
vectors. I will extend an iteratively reweighted algorithm from recovering sparse vectors to recovering low-rank
matrices, e.g. image reconstruction from its partial pixel values.
Mainly I will present a convergence analysis of an unconstrained $\ell_q$ minimization algorithm
to compute the sparse solution and extend the analysis to the matrix completion problem.
Finally, I shall present some numerical results for recovering images from their randomly sampled
entries with noise.
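A rough sketch of the iteratively reweighted idea in the sparse-vector setting (the low-rank matrix case reweights singular values instead); the dimensions, q = 0.5 and the epsilon schedule are illustrative choices, not the exact algorithm from the talk:

```python
import numpy as np

def irls_lq(A, b, q=0.5, iters=100):
    """Sketch: approach min ||x||_q^q s.t. Ax = b by solving a sequence
    of weighted least-squares problems with annealed smoothing."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # min-norm starting point
    eps = 1.0
    for _ in range(iters):
        w = (x**2 + eps) ** (1.0 - q / 2.0)   # diagonal of W^{-1}
        Aw = A * w                            # A @ diag(w)
        x = w * (A.T @ np.linalg.solve(Aw @ A.T, b))
        eps = max(eps / 10.0, 1e-8)           # anneal the smoothing term
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x0 = np.zeros(100)
x0[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
x_hat = irls_lq(A, A @ x0)  # recover a 5-sparse vector from 40 measurements
```

Each iteration has a closed-form solution of the equality-constrained weighted least-squares problem; the annealed eps keeps the weights well defined as entries of x approach zero.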
Fusion of Complementary Sensor Modalities for Near-Range Environment Perception as a Basis for Low-Speed Driver Assistance Systems
Leo Vepa, Robert Bosch GmbH, Leonberg
Novel driver assistance systems for parking and maneuvering functions require precise information about the vehicle's surroundings. Planning, monitoring and executing such low-speed maneuvers requires a model of the entire immediate vehicle environment. A particular challenge is the large angular range that the vehicle's environment sensors must cover. Series-proven sensors with wide opening angles, such as camera, ultrasonic and radar sensors, are particularly well suited for this.
To increase the performance of environment perception, we pursue a fusion
of the measurement data from the different environment sensors. Owing to the
sensors' different physical measurement principles and the growing number of
measurements over time, information fusion makes it possible to minimize
measurement uncertainties and acquire more precise data, and thus to obtain
a more accurate and complete description of the vehicle's surroundings.
As a basis for data fusion, a fusion architecture is required that
ensures that the different sensor data are correctly integrated into the
overarching environment model. This environment model can then serve
as a data basis for various driver assistance functions.
Globally Optimal and Local Nonlinear Methods for Scene-Based Fixed-Pattern-Noise Correction
Marc Geese, Robert Bosch GmbH, Leonberg
The manufacturing of image sensors inevitably introduces spatial inhomogeneities in the sensor characteristics. These inhomogeneities produce so-called fixed-pattern noise (FPN), which degrades image quality, particularly for CMOS image sensors. Most image sensors can be
described by a linear sensor model, as specified, e.g., in the EMVA 1288 standard. In this linear case, the FPN consists of the components DSNU (dark signal non-uniformity) and PRNU (photo response non-uniformity).
In general, the FPN components are only meta-stable, both thermally and over time, which creates a maintenance burden and makes recalibration necessary. Therefore, besides efforts toward improved pixel hardware and laboratory photometric calibration, scene-based methods for calibrating the FPN parameters are increasingly being developed. These make it possible to recalibrate a camera during long-term deployment or under strong temperature fluctuations (e.g. for driver assistance systems in the automotive domain).
The talk will present new globally optimal as well as new local methods for estimating the FPN parameters and compare them against methods known from the literature. Assumptions about the physical light signal are made and turned into methods based on the sensor model of the
EMVA 1288 standard. The FPN parameters estimated in this way are compared against the photometric laboratory calibration according to EMVA 1288.
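To fix notation, a tiny synthetic sketch of the linear EMVA 1288-style sensor model; the DSNU/PRNU statistics are invented, and the scene-based estimation that is the actual subject of the talk is not shown:

```python
import numpy as np

rng = np.random.default_rng(1)
h, w = 4, 5

# Invented per-pixel fixed-pattern parameters of the linear sensor model:
dsnu = rng.normal(10.0, 2.0, (h, w))   # dark-signal offset per pixel (DSNU)
prnu = rng.normal(1.0, 0.05, (h, w))   # multiplicative gain per pixel (PRNU)

def capture(irradiance):
    """Linear sensor model: y = DSNU + PRNU * irradiance."""
    return dsnu + prnu * irradiance

def correct(y, dsnu_est, prnu_est):
    """Invert the model given (estimated) FPN parameters."""
    return (y - dsnu_est) / prnu_est

scene = rng.uniform(50.0, 200.0, (h, w))
corrected = correct(capture(scene), dsnu, prnu)  # exact with true parameters
```

With perfectly estimated parameters the correction inverts the FPN exactly; the practical difficulty, addressed in the talk, is estimating DSNU and PRNU from scene content alone.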
CRFs in Action: Intrinsic Images and Decision Tree Fields
Carsten Rother, Microsoft Research, Cambridge
In this talk I will present two upcoming papers (NIPS '11 and ICCV '11),
which both utilize Conditional Random Fields (CRFs) but are otherwise quite different.
The task of recovering intrinsic images is to separate a given input
image into its material-dependent properties, known as reflectance or
albedo, and its light-dependent properties, such as shading and shadows.
We develop a new CRF model which achieves state-of-the-art results. The
key novel ingredient is a sparseness prior on reflectance, which encodes
the property that a scene is often composed of a few different materials.
Decision Tree Fields (DTFs) are a new model that combines and
generalizes random forests and conditional random fields (CRFs). The key
idea is to have a very large number of potential functions which are all
based on non-parametric decision trees. We show that learning and
inference is still tractable for models with millions of parameters. We
demonstrate excellent performance for various tasks such as in-painting
and person detection in depth images.
Image Fusion of Combined Stereo, Focus and Spectral Series, Illustrated with the Camera Array of the Fraunhofer IOSB
There are numerous approaches to image-based acquisition of spatial scene properties. Among them is the multi-stereo approach, in which several cameras at different positions capture images simultaneously and the resulting stereo effect is used for spatial reconstruction of the scene. With such camera arrays, further acquisition parameters can additionally be varied so that more information about the scene can be gained: if the focus settings of the cameras are varied, the focus effect yields additional information about the spatial shape of the scene that can be used in the reconstruction. If spectral filters are placed in front of the (grayscale) cameras, the additional spectral information can be used, e.g., for material classification. However, the resulting combined image series have the drawback that they can no longer be evaluated with standard stereo algorithms. The talk presents methods suitable for evaluating such combined image series. A region-based formulation of the strongly coupled fusion problem is employed and solved using energy minimization methods.
Image Processing for Dynamic Contrast Enhanced Magnetic Resonance Image Sequences
Dynamic Contrast Enhanced Magnetic Resonance Imaging (DCE-MRI) is a
diagnostic approach in which the distribution of injected contrast agent is
imaged with high temporal resolution in order to identify potentially
pathological transport properties. Major software challenges for this modality include:
- determining tissue transport properties from intensity distributions,
- reconstructing images from rapidly and incompletely sampled data, and
- removing physiological motion in order to track tissue points.
Successful approaches for the first two tasks have been demonstrated, while
the current research front focuses on the third task. Patient motion is to be
eliminated by registering each image of a DCE-MRI sequence to an appropriate
target. However, defining an image similarity measure is complicated by the following:
- Intensity changes due to motion cannot be separated easily from those
due to the distribution of contrast agent.
- Higher contrast in one moment creates new structures not present in
the previous moment, meaning the edges may not match appropriately.
- If segmentation is to be performed simultaneously in order to match
segments, first order regularization models are typically not suitable
because of many gradual intensity variations.
Since the force driving registration is stronger for intensity-based as
opposed to edge-based similarity measures, the approach considered here is to adapt
intensities in local segments of the target image to better match intensities
of local segments in the transformed image. The segmentation is based upon a
higher order model which is more suitable for piecewise smooth as opposed to
piecewise constant images. The methods implemented are seen as an
approximation to a higher order Mumford-Shah registration approach, about
which continuing research will be reported. Finally, an approach for
eliminating motion will be discussed which involves matching the entire
sequence all at once to a derived sequence.
Instrumentation and Mathematical Concepts for Multimodal Optical Imaging in Small Animals
In vivo molecular imaging modalities
Nuclear vs. optical imaging
Reasoning for combining imaging modalities
State-of-the-art multimodal instrumentation (small animals)
SPECT-CT-OT - The first trimodal imager (conventional approach)
Microlens-based optical detector for in vivo imaging
Mathematical concepts for image formation
Tomographic realizations for multimodal applications (OI-MRI, OI-PET)
A new kind of image processing journal
The new journal Image Processing on Line http://www.ipol.im/
publishes image analysis algorithms. Each journal article has four parts:
a) a careful text description of the algorithm on the main web page;
b) an on line demo running the algorithm in real time;
c) a non-moderated archive of all experiments performed by users;
d) a commented code in C or C++.
According to the recent statistics of the first published articles, this
format enables rapid and wide dissemination.
The publication criterion is not novelty, but the interest to the
scientific community of certifying and diffusing the algorithm. Each
submission is carefully evaluated to ensure "reproducible research". The
scientific editors request referees to check whether a), b), c) and d)
fit perfectly or not. Indeed, the main goal is to publish certified
reference versions of algorithms.
It is hoped that this new format will foster experiment sharing, on line
benchmarks, collaborative projects and in general accelerate research by
providing certified algorithms.
This journal is in its starting phase, but some fifteen algorithms are
in the course of publication and twenty more have been submitted. An on
line publication is different from --and complementary to-- a journal
publication. I'll briefly describe several on line algorithms, discuss
the technical and organizational challenges of such publications, and take questions.
The Quadratic-Chi Histogram Distance Family
Michael Werman, The Institute
of Computer Science, The Hebrew University of Jerusalem
Jerusalem 91904, Israel
We present a new histogram distance family, the Quadratic-Chi (QC).
QC members are Quadratic-Form distances with a cross-bin chi-squared-like
normalization. The cross-bin chi-squared-like normalization reduces the
effect of large bins having undue influence. Normalization was shown to be
helpful in many cases, where the chi-squared histogram distance outperformed
the L2 norm. We show that the new QC members outperform state-of-the-art
distances for these tasks, while having a short running time. The
experimental results show that both the cross-bin property and the
normalization are important.
If there is time I will show some results on the Earth Mover's Distance
and some pretty pictures (computational photography).
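A small sketch of a QC member as I read the published definition (A is a bin-similarity matrix and m the normalization degree; m = 0.9 here is only an illustrative choice):

```python
import numpy as np

def quadratic_chi(p, q, A, m=0.9):
    """Quadratic-Chi distance QC^m_A(P, Q): a quadratic-form distance
    with a chi-squared-like per-bin normalization (0/0 treated as 0)."""
    z = ((p + q) @ A) ** m                 # cross-bin normalizer per bin
    d = np.where(z > 0, (p - q) / np.where(z > 0, z, 1.0), 0.0)
    return float(np.sqrt(max(float(d @ A @ d), 0.0)))
```

As a sanity check, with A the identity and m = 0.5 this reduces to the square root of the classic chi-squared histogram distance.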
Filip Korc, Institut für Geodäsie und Geoinformation, Univ. of Bonn
On Markov Random Field Estimation for 3D Segmentation of MRI Knee Data
We present an example of employing a global statistical model in the context
of 3D semantic segmentation of magnetic resonance images of the knee. We
formulate a single model that allows us to jointly segment all classes, and
we describe possible approaches to estimating the model automatically from
labeled data, while pointing to the key computational challenges. We show
results of an approach in which an involved learning problem is reduced
to a simple histogram-based nonparametric density estimation.
04.10.2010 4:00 pm
Computational reconstruction of zebrafish early embryogenesis
by nonlinear PDE methods of image processing
Prof. Karol Mikula, Department of Mathematics, Slovak University of Technology, Bratislava, Slovakia
In the talk we present mathematical models and numerical
methods which lead to early embryogenesis reconstruction and
extraction of the cell lineage tree from the large-scale 4D image
sequences. Robust and efficient finite volume schemes for solving
nonlinear PDEs related to filtering, object detection and segmentation
of 3D images were designed to that goal and studied mathematically.
They were parallelized for massively parallel computer clusters and
applied to the mentioned problems in developmental biology. The
presented results were obtained in cooperation of groups at Slovak
University of Technology, Bratislava, CNRS, Paris and University of
Learning 3-D Models of Object Structure from Images
Recognizing objects in images presents a difficult challenge attributable
to large variations in object appearance, shape, and pose. The problem is
further compounded by ambiguity from projecting 3D objects into a 2D image.
I will present an approach to resolve these issues by modeling object
structure with a collection of connected 3D geometric primitives and a
model for the camera. From sets of images we simultaneously learn a
statistical model for the object representation and parameters of the
system. We explore our approach in the context of microscopic images of
biological structure and single view images of man-made objects composed of
block-like parts, such as furniture. We express detected features from both
domains as statistically generated by an image likelihood conditioned on
the object structure and imaging system. Our results demonstrate
that we can infer both 3D object and camera parameters simultaneously from
images, and that doing so improves understanding of structure in images.
Benchmarking Stereo Vision and Optical Flow Algorithms
Stereo vision and optical flow methods attempt to measure scene depth
and motion by matching and tracking pixels across images. To evaluate
the performance of such methods, we need "ground truth" - the true
depth or true object motion. In this talk I will describe different
techniques for creating image datasets with ground truth, including
structured lighting, laser and CT scanners, and hidden fluorescent
texture. The Middlebury datasets are now well-established benchmarks
in computer vision, and I will discuss both benefits and potential
pitfalls of such benchmarks. I will also briefly touch on how data
with ground truth can aid in developing new algorithms.
The Lazy Flipper: A Minimal Exhaustive Search Algorithm for
Models with Higher Order Operands
Bjoern Andres, HCI, Univ. of Heidelberg
The optimization of functions of binary variables that decompose
according to a graphical model is NP-hard in the general case. Good
bounds on the global optimum are essential in computer vision. A search
algorithm (The Lazy Flipper) is introduced that starts from an initial
assignment of zeros and ones to the variables and converges to a global
optimum. Along the way, it passes through a series of monotonically
improving local minima; some of these are guaranteed to be the best
configurations within a given and increasing Hamming distance. For a
submodular Ising model, the algorithm finds surprisingly good upper
bounds on the minimum energy within limited search depth. For a
difficult non-submodular image segmentation problem with higher order
potentials, it finds bounds that are 22% lower than those of min-sum
belief propagation, in 1/50 of the runtime.
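The search itself can be illustrated by a deliberately naive variant (the contribution of the Lazy Flipper is doing this lazily and efficiently with tailored data structures, which this sketch ignores); the toy chain energy is invented:

```python
import numpy as np
from itertools import combinations

def flip_search(energy, x, max_depth=2):
    """Repeatedly try flipping every subset of up to `max_depth` binary
    variables; accept improving flips until none exist, i.e. until x is
    the best configuration within Hamming distance `max_depth`."""
    x = x.copy()
    improved = True
    while improved:
        improved = False
        for k in range(1, max_depth + 1):
            for subset in combinations(range(len(x)), k):
                y = x.copy()
                y[list(subset)] ^= 1
                if energy(y) < energy(x):
                    x, improved = y, True
    return x

def chain_energy(x, unary=(0.4, -1.0, 0.3, 0.3, -1.2, 0.1), coupling=0.5):
    # Toy Ising-like chain: unary costs for x_i = 1 plus a penalty for
    # each pair of disagreeing neighbors.
    return float(np.dot(unary, x)) + coupling * np.count_nonzero(np.diff(x))

x_opt = flip_search(chain_energy, np.zeros(6, dtype=int), max_depth=2)
```

Every accepted flip strictly lowers the energy, so the search terminates in a local minimum that no flip of up to `max_depth` variables can improve.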
Graph-cut Based Image Segmentation with Connectivity Priors
Sara Vicente, Dept. of Computer Science, University College London, United Kingdom
Graph cut is a popular technique for interactive image segmentation.
However, it has certain shortcomings. In particular, graph cut has problems
with segmenting thin elongated objects due to the "shrinking bias".
In this talk I'll describe how imposing an additional connectivity prior can
help overcome this problem.
We have formulated several versions of the connectivity constraint and
showed that the corresponding optimization problems are all NP-hard.
For some of these versions we proposed two optimization algorithms: (i) a
practical heuristic technique which we call DijkstraGC, and (ii) a slow
method based on problem decomposition which provides a lower bound on the
optimum.
Enriching MRF-based models with higher-order constraints, such as
connectivity, follows a recent trend of research in image segmentation.
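As a minimal illustration of what the connectivity prior demands of a labeling (not of how DijkstraGC enforces it), here is a 4-connectivity check on a binary mask; the helper is my own:

```python
from collections import deque
import numpy as np

def is_connected(mask):
    """True iff the foreground of a binary mask forms a single
    4-connected component (the constraint imposed on segmentations)."""
    fg = set(zip(*np.nonzero(mask)))
    if not fg:
        return True
    start = next(iter(fg))
    seen, queue = {start}, deque([start])
    while queue:            # breadth-first flood fill from one fg pixel
        r, c = queue.popleft()
        for p in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if p in fg and p not in seen:
                seen.add(p)
                queue.append(p)
    return len(seen) == len(fg)
```

Checking the constraint is easy; the hard part, as the abstract notes, is optimizing a segmentation energy subject to it.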
Structured Prediction with Global Interactions: Connectivity-Constrained Segmentation
Dr. Sebastian Nowozin, Microsoft Research, Cambridge UK
"Structured prediction" refers to prediction functions with an output domain
that contains dependencies, relations and constraints among multiple
variables. In the last decade, random field models (MRF, CRF) have taken a
prominent place and are popularly applied to solve structured prediction tasks
in computer vision.
However, in order to be computationally tractable they typically incorporate
only local interactions and do not model global properties, such as
connectedness, a potentially useful high-level prior for object segmentation.
Recently, similar high-arity potential functions of specialized type have
received attention from the research community. In this work, we show how an
NP-hard potential function ensuring connectedness of the output labeling can
be approximated in the framework of recent MAP-MRF linear programming
relaxations. Using techniques from polyhedral combinatorics, we show that an
approximation to the MAP solution of the resulting MRF can still be found
efficiently by solving a sequence of max-flow problems.
The contribution is a part of a larger trend of deriving richer predictive
models customized to the problem structure of computer vision problems.
Improving the Tomographic Reconstruction of Particle Volumes by
Nonlinear Warping of the Search Space
Sebastian Gesemann, DLR (German Aerospace Center), Goettingen
This talk addresses a subproblem of the optical measurement technique
"TomoPIV" (Tomographic Particle Image Velocimetry). TomoPIV delivers
three-dimensional fields of velocity vectors and is of interest for the
study of flows. Computing these fields requires a fast and accurate
three-dimensional reconstruction of the particle volumes. The talk
presents and evaluates several reconstruction methods, among them a
new, promising approach. Various test cases were used to evaluate the methods.
Learning from Labeled and Unlabeled Data, Global vs. Multiscale Approaches.
In recent years there has been increasing interest in
learning from both labeled and unlabeled data
(a.k.a. semi-supervised learning, or SSL).
The key assumption in SSL, under which an abundance
of unlabeled data may help, is that there is some
relation between the unknown response function
to be learned and the marginal density of the input data.
In the first part of this talk I'll present a statistical analysis
of two popular graph-based SSL algorithms: the Laplacian
regularization method and Laplacian eigenmaps.
In the second part I'll present a novel multiscale approach
for SSL as well as supporting theory. Some
intimate connections to harmonic analysis on abstract
data sets will be discussed.
Joint work with Nati Srebro (TTI), Xueyuan Zhou (Chicago),
Matan Gavish (WIS/Stanford) and Ronald Coifman (Yale).
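The graph-based setting of the first part can be illustrated with the harmonic (Laplacian-regularized) solution on a tiny chain graph; the interface and the small ridge term are my own choices:

```python
import numpy as np

def laplacian_ssl(W, labeled, y_labeled, reg=1e-6):
    """Harmonic / Laplacian-regularized label propagation: minimize
    f^T L f over f, keeping f fixed to the given labels on `labeled`."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    Luu = L[np.ix_(unlabeled, unlabeled)] + reg * np.eye(len(unlabeled))
    Lul = L[np.ix_(unlabeled, labeled)]
    f = np.empty(n)
    f[labeled] = y_labeled
    f[unlabeled] = np.linalg.solve(Luu, -Lul @ np.asarray(y_labeled, float))
    return f

# Chain graph 0-1-2-3-4 with the endpoints labeled 0 and 1:
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
f = laplacian_ssl(W, np.array([0, 4]), np.array([0.0, 1.0]))
```

On the chain, the solution interpolates the labels harmonically, approximately [0, 0.25, 0.5, 0.75, 1]; how such estimators behave as the amount of unlabeled data grows is exactly what the statistical analysis in the talk addresses.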
Interactive Learning and Segmentation Tool Kit - Development Snapshot
For biomedical and industrial applications, segmentation and
classification techniques are of high relevance. We develop a supervised
classification framework which shall enable the user to interactively
train a segmentation system without any need for custom programming. It
is able to tackle multi-object segmentation and multi-class object
classification in settings where long-range context is not required.
Three-dimensional and spectrally resolved input will be supported. We
want to demonstrate the current development of the graphical user
interface written in Python/Qt and give an outlook on future
development, including parallelization.
3D Internet, Intelligent Simulated Reality and More:
Research Topics at the New "Intel Visual Computing Institute" in Saarbrücken
"Visual computing", with its enormous performance and real-time requirements, largely defines the processor architectures of today's chip manufacturers -- from small smartphones to supercomputers. In this context, after a long search, Intel decided in the middle of this year to establish the first "Intel Visual Computing Institute" in Saarbrücken, in cooperation with Saarland University, the German Research Center for Artificial Intelligence (DFKI) and the Max Planck Institutes for Informatics and for Software Systems.
The first research projects are now under way at this university institute, which is dedicated to basic research. In my talk I will briefly explain the structure and orientation of the new institute and discuss a few selected research topics in more detail.
Mapping the space of cellular phenotypes using genome-wide
automated image analysis and multiparametric cellular descriptors
Wolfgang Huber, EMBL
Phenotyping of cellular model systems through high content screening
(automated microscopy and image analysis) is a powerful approach to
associate genes with biological processes. It also opens the
possibility to systematically assay genetic and chemical perturbations
and their interactions. Statistical data analysis and computing
infrastructures are necessary to address this task. I will describe an
approach to the complete workflow of: image segmentation and feature
extraction, screen quality assessment, feature selection and distance
metric learning, data presentation and integrative biological analyses.
The methods are provided as freely available R/Bioconductor packages.
I will describe biological insights from applying this approach, and
the computational challenges of experiments ahead.
Pose invariant shape prior segmentation using continuous graph cuts and gradient descent on Lie groups
Prof. Anders Heyden, Malmoe University and Lund University
In this talk I will propose a novel formulation of the Chan-Vese
model for pose invariant shape prior segmentation as a continuous graph
cut problem. The model is based on the classic L2 shape dissimilarity
measure and with pose invariance under the full (Lie-) group of similarity
transforms in the plane. To overcome the common numerical problems
associated with step size control for translation, rotation and scaling in
the discretization of the pose model, a new gradient descent procedure
for the pose estimation is introduced. This procedure is based on the
construction of a Riemannian structure on the group of transformations
and a derivation of the corresponding pose energy gradient. Numerically
this amounts to an adaptive step size selection in the discretization of
the gradient descent equations. Together with efficient numerics for
TV-minimization we get a fast and reliable implementation of the model.
Moreover, the theory introduced is generic and reliable enough for
application to more general segmentation and shape models.
1. Global Segmentation and Curvature Analysis of Volumetric
Data Sets Using Trivariate B-spline Functions.
2. Coupled Non-Parametric Shape and Moment-Based Inter-Shape Pose Priors for Multiple Basal Ganglia Structure Segmentation.
Octavian Soldea, PhD
Computer Science Department
Each part will take 20-25 minutes.
Global Segmentation and Curvature Analysis of Volumetric Data Sets Using Trivariate B-spline Functions
This work presents a method to globally segment volumetric images into
regions that contain convex or concave (elliptic) iso-surfaces, planar or
cylindrical (parabolic) iso-surfaces, and volumetric regions with saddle-like
(hyperbolic) iso-surfaces, regardless of the value of the iso-surface
level. The proposed scheme relies on a novel approach to globally compute,
bound, and analyze the Gaussian and mean curvatures of an entire volumetric
data set, using a trivariate B-spline volumetric representation. This
scheme derives a new differential scalar field for a given volumetric
scalar field, which could easily be adapted to other differential properties.
Moreover, this scheme can set the basis for more precise and accurate
segmentation of data sets targeting the identification of primitive parts.
Since the proposed scheme employs piecewise continuous functions, it is
precise and insensitive to aliasing.
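The region types named above (elliptic, parabolic, hyperbolic) follow from the signs of the Gaussian curvature K and mean curvature H. The hypothetical helper below illustrates only this classification step; the talk's trivariate B-spline machinery is not reproduced, and the convex/concave split assumes one particular sign convention for H:

```python
# Classify a surface point from its Gaussian curvature K and mean curvature H.
# Illustrative only: the convex/concave assignment depends on the chosen
# normal orientation convention.

def classify_point(K, H, eps=1e-9):
    if K > eps:
        return "concave (elliptic)" if H > 0 else "convex (elliptic)"
    if K < -eps:
        return "saddle-like (hyperbolic)"
    return "planar (parabolic)" if abs(H) <= eps else "cylindrical (parabolic)"

# A sphere of radius 2 has K = 1/4 and H = 1/2 (inward-curving normal).
print(classify_point(K=0.25, H=0.5))
print(classify_point(K=-1.0, H=0.0))
print(classify_point(K=0.0, H=0.5))
```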
Coupled Non-Parametric Shape and Moment-Based Inter-Shape Pose Priors for Multiple Basal Ganglia Structure Segmentation
This work presents a new active contour-based, statistical method for
simultaneous volumetric segmentation of multiple subcortical structures in
the brain. In biological tissues, such as the human brain, neighboring
structures exhibit co-dependencies which can aid in segmentation,
if properly analyzed and modeled. Motivated by
this observation, we formulate the segmentation problem
as a maximum a posteriori estimation problem, in which
we incorporate statistical prior models on the shapes and
inter-shape (relative) poses of the structures of interest.
This provides a principled mechanism to bring high level
information about the shapes and the relationships of
anatomical structures into the segmentation problem. For
learning the prior densities based on training data, we
use a nonparametric multivariate kernel density estimation
framework. We combine these priors with data in a variational framework
and develop an active contour-based iterative segmentation algorithm. We
test our method on the problem of volumetric segmentation of basal ganglia
structures in magnetic resonance (MR) images. We present
a set of 2D and 3D experiments as well as a quantitative performance
analysis. In addition, we perform a comparison
to several existing segmentation methods and demonstrate
the improvements provided by our approach in terms of segmentation accuracy.
The GeoMap: A Unified Representation (not only) for Image Segmentation
Dr. rer. nat. Hans Meine, Computer Science Department, University of Hamburg
Image analysis is dealing with a wealth of algorithms; for instance,
there is a large variety of region- and boundary-based segmentation
methods, each with different strengths and weaknesses.
Unfortunately, the comparison or combination of such methods is made
difficult by the fact that virtually every approach uses its own internal representation.
This talk describes the "GeoMap", a unified formalism for
representation of segmentation states, i.e. plane partitions. The
main characteristic is that it includes both topological and
geometrical aspects of a segmentation result. It also comes with
corresponding modification operations which are guaranteed to preserve
the consistency of the GeoMap.
Although it was originally developed as a means to combine the
strengths of different segmentation algorithms in a common framework,
its extension to a sub-pixel precise (polygonal) geometry has opened
it up for other applications. Therefore, in this talk we will not
only show automatic and (semi-)interactive segmentation approaches,
but also skeletonization and a new boundary reconstruction method
based on alpha-shapes.
An Invitation to Network Analysis
Informatik & Informationswissenschaft,
As a methodology, network analysis is currently diffusing into an
incredibly wide range of applications in the social, life, and other
sciences. In general, the objects of interest are attributed graphs
which are most often analyzed using approaches based on algorithmic graph
theory, linear algebra, or statistics. I will discuss three exemplary
methods for indexing (vertex centrality), grouping (structural similarity)
and modelling (event networks) in the context of various applications.
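As a minimal example of the indexing step (vertex centrality) mentioned above, the sketch below computes two standard indices on a toy graph; the graph and function names are illustrative:

```python
# Two simple vertex centrality indices on an undirected graph
# given as adjacency lists.
from collections import deque

def degree_centrality(adj):
    return {v: len(nbrs) for v, nbrs in adj.items()}

def closeness_centrality(adj):
    """1 / (sum of shortest-path distances), computed via BFS from each vertex."""
    out = {}
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        out[s] = (len(adj) - 1) / sum(dist.values())
    return out

# A star graph: the hub is maximally central under both indices.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
dc = degree_centrality(star)
cc = closeness_centrality(star)
print(max(dc, key=dc.get), max(cc, key=cc.get))
```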
Discriminative learning of max-sum structural classifiers
International Research and Training Centre for Information
Technologies and Systems, Kiev, Ukraine
Dept. Image Processing and Recognition
The talk is devoted to learning max-sum structural classifiers. The
output of such a classifier is a solution of the best-labeling
problem, often also referred to as the energy minimization problem.
We briefly survey existing methods for discriminative learning of such
classifiers. Most of the methods share the same computation scheme:
they are iterative, and they require computing the output of the
max-sum classifier in each iteration. However, such a computation is
often a hard problem in itself.
We will show that it is possible to learn a wide class of max-sum
classifiers without iteratively computing their output. Moreover,
it is possible to learn without computing the classifier's output at
all. The scheme we will propose can also be used to approximately
solve learning problems for max-sum classifiers.
Finally, we will present ideas for further research.
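For chain-structured models, the best-labeling output of a max-sum classifier can be computed exactly by dynamic programming. The sketch below uses the standard Viterbi-style max-sum recursion with made-up scores; this is generic background, not the learning scheme proposed in the talk:

```python
# Exact best labeling of a chain-structured max-sum model via dynamic
# programming. unary[t][k] is the score of label k at position t;
# pairwise[j][k] is the transition score between neighboring labels.

def max_sum_chain(unary, pairwise):
    T, K = len(unary), len(unary[0])
    score = [unary[0][:]]        # best score of any labeling ending in label k
    back = []                    # backpointers for recovering the argmax
    for t in range(1, T):
        score.append([0.0] * K)
        back.append([0] * K)
        for k in range(K):
            best = max(range(K), key=lambda j: score[t - 1][j] + pairwise[j][k])
            back[t - 1][k] = best
            score[t][k] = score[t - 1][best] + pairwise[best][k] + unary[t][k]
    # trace the best labeling back through the chain
    labels = [max(range(K), key=lambda k: score[T - 1][k])]
    for t in range(T - 2, -1, -1):
        labels.append(back[t][labels[-1]])
    return labels[::-1]

unary = [[2.0, 0.0], [0.0, 1.0], [0.0, 2.0]]   # toy data scores
pairwise = [[0.5, 0.0], [0.0, 0.5]]            # rewards keeping the same label
print(max_sum_chain(unary, pairwise))
```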
Extended and Constrained Diagonal Weighting Algorithm for Image Reconstruction
Ovidius University, Constanta, Romania
In 2001 Y. Censor, D. Gordon and R. Gordon introduced a new iterative
parallel technique suitable for large and sparse unstructured systems of linear
equations - the Diagonal Weighting algorithm (DW) - as a generalization of
the classical Cimmino reflections method. It offers an approximation of the
least squares solutions of ||Ax - b|| = min! in the consistent case, whereas
in the inconsistent one, it only approximates the least squares solutions of a
weighted problem ||Ax - b||_M = min!, where M is a symmetric and positive
definite matrix.
In our talk we present developments of the above DW algorithm in the
following two directions:
- an extension to the inconsistent case of ||Ax - b|| = min!, which
produces sequences of approximations that always converge to a least
squares solution;
- a constraining strategy, for both the DW algorithm and its extension.
Numerical experiments are performed on two "phantom" images generated
with the SNARK'93 software package for image reconstruction in computerized
tomography.
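For intuition, the classical Cimmino iteration that DW generalizes can be sketched as follows: each row of Ax = b defines a hyperplane, and the iterate moves by the averaged projections onto all hyperplanes. The equal weights and the toy system are illustrative assumptions; the actual diagonal weighting of Censor, Gordon and Gordon is not reproduced here:

```python
# Cimmino-type iteration for Ax = b: average the orthogonal projections
# of the current iterate onto the hyperplanes defined by the rows of A.

def cimmino(A, b, iters=500, relax=1.0):
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        step = [0.0] * n
        for i in range(m):
            ai = A[i]
            norm2 = sum(a * a for a in ai)
            # signed distance factor to the i-th hyperplane
            resid = (b[i] - sum(a * xj for a, xj in zip(ai, x))) / norm2
            for j in range(n):
                step[j] += resid * ai[j] / m   # equal weights 1/m
        x = [xj + relax * sj for xj, sj in zip(x, step)]
    return x

# Consistent 2x2 toy system with exact solution (1, 2).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [4.0, 7.0]
x = cimmino(A, b)
print(x)
```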
A Multilayered Latent Aspect Model for Multimodal Image Collections
IMT, Lucca, Italy
Bridging the gap between the low level representation of the visual
content and the underlying high-level semantics is a major research
issue of current interest. This talk introduces a novel latent aspect
model addressing visual content understanding through a multi-level
approach that exploits a layered representation of both the visual and
semantic information. On the visual level, a multi-resolution
representation of the pictorial data is provided by exploiting a
computational model of a cortical memory mechanism to pool together
local visual patches, organizing them into perceptually meaningful
intermediate structures. Such a representation is paired with a
hierarchical organization of the latent space (i.e. the semantic part of
the model), allowing the discovery and organization of the visual topics
into a hierarchy of aspects. The proposed model is shown to effectively
address the unsupervised discovery of relevant visual classes from
pictorial collections, segmenting out the image regions containing the
discovered classes. Further, it is shown how this model can be extended
to process and represent multi-modal collections comprising textual and
visual information.
Statistical Learning Approaches for Computational Pathology
The histological assessment of human tissue has emerged as the key challenge for the detection and treatment of cancer. We employ ensemble learning techniques and survival statistics to automate and objectify two of the most crucial tasks in modern pathology:
(i) A framework to quantify biomarkers in tissue microarrays (TMA) is developed for biomedical research. Due to the absence of ground truth we utilize the information gained from extensive labeling experiments with domain experts. Based on this gold standard we assess the inter and intra variability of pathologists and train various models for nuclei detection, cancer classification and survival estimation.
(ii) Micrometastases are detected in sentinel lymph nodes for clinical therapy decisions.
I will conclude with an overview of biomedical research projects in the ETH Machine learning group.
Bayesian Optimization of Magnetic Resonance Imaging Sequences
Max Planck Institute for Informatics
Saarland University, Saarbruecken, Germany
We show how sampling trajectories of magnetic resonance imaging sequences
can be optimized by Bayesian computations. Combining
approximate Bayesian inference and natural image statistics
with high-performance numerical computation, we propose the
first Bayesian experimental design framework for this problem of
high relevance to clinical and brain research. Our solution requires
large-scale approximate inference for dense, non-Gaussian models.
We propose a novel variational inference algorithm, which is scaled up to
full high-resolution images through primitives of numerical mathematics and
signal processing. Our approach is evaluated on raw data from a Siemens 3T scanner.
Joint work with Hannes Nickisch, Rolf Pohmann, Bernhard Schoelkopf,
MPI for Biological Cybernetics, Tuebingen.
Estimating uncertainty in the presence of multiple labels.
Dr. Nikolaos Gianniotis,
University of Heidelberg
It is usual that in a classification task, the dataset is created by
a group of experts, thus collecting multiple labels per data item.
However, experts are rarely unanimous on their evaluations, and
the collected labels are often conflicting. Such ambiguity in the labels,
introduces an additional source of noise in the classification task.
Furthermore, without absolute confidence in the labels it becomes
difficult to evaluate the performance of the learnt classifiers.
This talk presents some results from a preliminary investigation into the above issues.
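One common way to quantify the label ambiguity described above is the entropy of each item's empirical label distribution; this is a generic illustration, not necessarily the measure used in the talk, and the votes are made up:

```python
# Per-item annotator disagreement summarized as the entropy (in bits)
# of the empirical distribution of the collected labels.
from collections import Counter
from math import log2

def label_entropy(votes):
    counts = Counter(votes)
    n = len(votes)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(label_entropy(["A", "A", "A"]))       # unanimous experts: no ambiguity
print(label_entropy(["A", "B", "A", "B"]))  # evenly split: maximal ambiguity
```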
Learning with Few Examples
Current machine learning approaches often need a huge number of training
examples to learn from. This requirement is contrary to the abilities of
the human visual system, which is able to recognize many object
categories from just a few views. There is a common belief that this
ability is based on generalization from previously learned similar classes.
The talk will present a classifier extension which explicitly
incorporates information from a set of similar classes to learn
a new category model given just few examples. The method is based on
maximum-a-posteriori estimation of decision tree parameters. Shared
knowledge is represented using a special prior distribution which
enables the regularization of the ill-posed parameter estimation.
Applications can be found in character recognition and image
categorization and will be presented in some experiments.
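The maximum-a-posteriori idea can be illustrated on a single tree leaf: with only a few training examples, a prior distribution transferred from similar classes smooths the class estimate. The Dirichlet-style prior, the strength parameter, and all numbers below are assumptions for illustration, not the talk's actual model:

```python
# MAP estimate of a leaf's class distribution: pseudo-counts from a prior
# (here, a distribution transferred from similar classes) regularize the
# otherwise ill-posed estimate from very few examples.

def map_leaf_estimate(counts, prior_mean, strength=5.0):
    """counts[k]: training examples of class k in this leaf;
    prior_mean[k]: class distribution transferred from similar classes."""
    total = sum(counts) + strength
    return [(c + strength * p) / total for c, p in zip(counts, prior_mean)]

# Only two examples reach the leaf, but the prior keeps the estimate smooth.
counts = [2, 0]
ml = [c / sum(counts) for c in counts]           # maximum-likelihood estimate
post = map_leaf_estimate(counts, [0.5, 0.5])     # MAP estimate
print(ml, post)
```

With more data the counts dominate and the MAP estimate approaches the maximum-likelihood one, which is the usual behavior of such priors.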
Boosted Projections for Classification
Robert Bosch GmbH, Hildesheim
Numerous sophisticated boosting algorithms can be found in the literature. The selection of appropriate weak learners has not yet received the same level of attention - though the performance and complexity of the weak learners greatly affect the performance of the overall system.
I propose to use a weak learning scheme inspired by the well-known Projection Pursuit method for data regression. Its advantages are cheap computation, simple regularization, and flexibility. The scheme is compared to state-of-the-art classification algorithms on a selection of UCI data sets and shows comparable performance.
In a second part, extensions to Boosted Projections are proposed. The extensions are mainly designed to improve performance on high-dimensional data, e.g. images or time series. The first extension tries to find low-dimensional informative subspaces during the course of boosting training. The second extension adds a shift-invariant feature generation layer to deal with local image deformations. Some preliminary results are shown.
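A minimal projection-based weak learner in this spirit might look as follows: project the data onto a direction, then fit a 1D threshold. The direction search, regularization, and boosting loop are omitted, and all names and data are illustrative:

```python
# A stump on a 1D projection: the kind of cheap weak learner a
# projection-pursuit-style boosting scheme could build on.

def project(X, w):
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in X]

def best_stump(z, y):
    """Best threshold/polarity on scalar features z for labels y in {-1,+1}.
    Returns (training accuracy, threshold, sign)."""
    best = None
    for t in sorted(set(z)):
        for sign in (+1, -1):
            acc = sum(1 for zi, yi in zip(z, y)
                      if (sign if zi >= t else -sign) == yi) / len(y)
            if best is None or acc > best[0]:
                best = (acc, t, sign)
    return best

# Toy data, separable along the first coordinate.
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [-1, +1, -1, +1]
acc, t, sign = best_stump(project(X, [1.0, 0.0]), y)
print(acc, t, sign)
```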
Note: This colloquium will exceptionally take place in room 041 of the BIOQUANT (Im Neuenheimer Feld 267) at 5 pm.
Molecular histology of cells and tissue: Label free imaging of biomolecular distributions
FOM-Institute for Atomic and Molecular Physics,
1098 SJ Amsterdam, The Netherlands
Histopathology using H&E stained tissue sections is one of the established methods used to reveal molecular anomalies at the cellular and tissue level. It is commonly understood that a better fundamental understanding of the molecular basis of disease is rapidly changing health care. Diagnosis of diseases, classification of the stage of a disease, and the investigation of the efficacy of a treatment still rely on established and validated methods. Physics is contributing to this multidisciplinary research by the development of new tools for health care. This lecture will focus on one of these developments: molecular histology with high performance mass spectrometry.
Imaging mass spectrometry (IMS) is a powerful technique that enables researchers to identify and localize biological compounds directly on tissue without the need for radioactive or fluorescence labels or immunochemical reagents. It opens up a new way for molecular histological research. The advantages of using a mass spectrometer for molecular imaging in the discovery phase of any biomedical experiment are large. It eliminates the need for labeling, as the molecular mass is used as an endogenous label. This leaves the biomolecules of interest functionally unmodified. In this way it removes the interference of potential fluorescent labels with the biological function. Imaging mass spectrometry also allows the detection of post-translational modifications (PTM) as these generally involve mass changes. Often it is not possible to generate antibodies that allow the (labeled) visualization of the PTM distribution directly in tissue. With a mass spectrometer it is ‘easy’ to see the location of the mass change. An additional advantage is that mass spectrometry provides multiplexed information from a surface, as for each peak in the mass spectrum an image can be generated. The spatial resolution of this technique ranges from 200 nm to 100 micrometers depending on which class of molecules is imaged with which IMS technology.
Recently, IMS has been recognized as a tool in proteomics for in situ spatial analysis of biomolecules. We developed a multidimensional molecular imaging protocol for direct high-resolution tissue analysis. It has been employed for biomedical studies to identify small molecules, peptides and proteins and the positions where these disease-related molecules are present. In this lecture we will illustrate the recent advances in molecular histology using a mass spectrometer as a microscope.
Local Invariant Features for 3D Image Analysis
Chair of Pattern Recognition and Image Processing
Institute for Computer Science
Albert-Ludwigs-University, Freiburg i.Br., Germany
Seeing the Objects Behind the Parts: Compositional Scene Understanding
UC Berkeley, Dept. of EECS,
University of California, Berkeley
The compositional nature of visual objects significantly limits their
representation complexity and renders learning of structured object
models tractable. Adopting this modeling strategy, we (i)
automatically decompose objects into a hierarchy of relevant
compositions and (ii) learn such a compositional representation for
each category without supervision. The compositional structure supports
feature sharing already on the lowest level of small image patches.
Compositions are represented as probability distributions over their
constituent parts and the relations between them. The global shape of
objects is captured by a graphical model which combines all
compositions. Inference based on the underlying statistical model is
then employed to obtain a category level object recognition system. The
generative nature of the presented model provides direct insights into
the learned compositional structure of objects.
Finally, the approach has been successfully extended to near real-time
analysis of videos where category-level object recognition,
segmentation, and tracking of multiple objects are jointly handled. This
investigation shows how the key concept of compositionality can actually
be exploited for both making learning feasible and rendering
recognition computationally tractable.
On the Role of Exponential Functions in Image Interpolation.
Department of Electrical Engineering,
Technion - Israel Institute of Technology
A reproducing-kernel Hilbert space approach to image interpolation is
introduced. In particular, the reproducing kernels of Sobolev spaces
are shown to be exponential functions. These functions, in turn, give
rise to interpolation kernels that outperform presently available
methods. Both theoretical and experimental results are presented. A
tight l_2 upper-bound on the interpolation error is then derived,
indicating that the proposed exponential functions are optimal in this
regard. Furthermore, a unified approach to image interpolation by
ideal and non-ideal sampling procedures is derived and demonstrated,
suggesting that the proposed exponential kernels may have a
significant role in image modeling as well. Our conclusion is that the
proposed Sobolev-based approach could be instrumental and a preferred
alternative in many interpolation tasks.
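The interpolation scheme described above can be sketched generically: with an exponential kernel k(x, y) = exp(-|x - y|) (the flavor of Sobolev reproducing kernel discussed in the talk, up to normalization), interpolation solves a linear system on the samples and evaluates a kernel expansion elsewhere. The sample data and helper names are illustrative:

```python
# Kernel interpolation with an exponential kernel: solve K c = f on the
# sample sites, then interpolate as f(x) = sum_i c_i * k(x, x_i).
from math import exp

def k(x, y):
    return exp(-abs(x - y))

def solve(A, b):
    """Toy Gaussian elimination, no pivoting (K here is positive definite)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

# Illustrative 1D samples; the interpolant must reproduce them exactly.
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
c = solve([[k(a, b) for b in xs] for a in xs], ys)
interp = lambda x: sum(ci * k(x, xi) for ci, xi in zip(c, xs))
print([round(interp(x), 6) for x in xs])
```

Because k is a reproducing kernel, the resulting interpolant is the minimum-norm function in the associated Hilbert space that passes through the samples.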
Last update: 06.06.2013, 14:11