<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Dan Steinberg</title><link href="https://dsteinberg.github.io/" rel="alternate"></link><link href="https://dsteinberg.github.io/feeds/all.atom.xml" rel="self"></link><id>https://dsteinberg.github.io/</id><updated>2026-01-01T00:00:00+11:00</updated><entry><title>Causal Discovery in the Real World</title><link href="https://dsteinberg.github.io/causal-discovery.html" rel="alternate"></link><published>2026-01-01T00:00:00+11:00</published><updated>2026-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2026-01-01:/causal-discovery.html</id><summary type="html">&lt;h3&gt;Causal Discovery in the Real&amp;nbsp;World&lt;/h3&gt;
&lt;p&gt;Inferring causal structure from observational data is a fundamental challenge
in science and evidence-based decision-making. Most existing methods for
learning directed acyclic graphs (DAGs) assume that the true causal graph is
identifiable from data — an assumption that rarely holds cleanly in practice,
where causal …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;Causal Discovery in the Real&amp;nbsp;World&lt;/h3&gt;
&lt;p&gt;Inferring causal structure from observational data is a fundamental challenge
in science and evidence-based decision-making. Most existing methods for
learning directed acyclic graphs (DAGs) assume that the true causal graph is
identifiable from data — an assumption that rarely holds cleanly in practice,
where causal assumptions are violated, data is limited, and the space of
plausible structures is large. Our recent work targets these realistic
shortcomings from two&amp;nbsp;directions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://arxiv.org/abs/2602.01483"&gt;CaPE: Causal Preference Elicitation&lt;/a&gt;&lt;/strong&gt;
addresses the fact that &lt;span class="caps"&gt;DAG&lt;/span&gt; estimation from observational data alone is often
under-determined — many graph structures are consistent with the data. CaPE
brings a domain expert into the loop using a Bayesian active learning
framework that strategically queries the expert about edge relationships in the
graph. A three-way likelihood models expert judgments about edge presence and
directionality, with particle-based inference and an expected information gain
criterion selecting the most informative queries. The result is faster
convergence to the true causal structure and better recovery of causal effects
under a limited query budget. CaPE was accepted at &lt;span class="caps"&gt;ICML&lt;/span&gt;&amp;nbsp;2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://arxiv.org/abs/2605.07204"&gt;Arrow: A Foundation Model for Causal Discovery&lt;/a&gt;&lt;/strong&gt;
takes a complementary approach: rather than requiring task-specific training or
expert elicitation, Arrow is a transformer-based foundation model trained on
synthetic datasets with diverse known causal structures. At inference time it
performs zero-shot causal discovery on new tabular datasets — no fine-tuning
required. Arrow uses &lt;span class="caps"&gt;DAG&lt;/span&gt; factorization and skeleton-order decomposition to
predict graph structure, achieving performance comparable to or better than
existing methods at a fraction of the computational&amp;nbsp;cost.&lt;/p&gt;
&lt;p&gt;Together, these works push causal discovery toward practical deployment: CaPE
by making expert knowledge tractably useful, and Arrow by eliminating the
computational barrier to applying strong causal priors on new&amp;nbsp;problems.&lt;/p&gt;</content><category term="Projects"></category></entry><entry><title>Active Generation — Generative Models for Black Box Optimisation</title><link href="https://dsteinberg.github.io/active-generation.html" rel="alternate"></link><published>2024-01-01T00:00:00+11:00</published><updated>2024-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2024-01-01:/active-generation.html</id><summary type="html">&lt;h3&gt;&lt;strong&gt;Active Generation&lt;/strong&gt; - Generative models for black box&amp;nbsp;optimisation&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/active_generation.png" alt="Active generation"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Active generation as implemented by variational search distributions (&lt;span class="caps"&gt;VSD&lt;/span&gt;).
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Active generation advances the union of generative modelling and black-box
optimisation so that &lt;span class="caps"&gt;AI&lt;/span&gt; systems can design new artefacts &amp;#8212; from molecules and
materials to robotic components and algorithms &amp;#8212; directly from high-level
objectives. We …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;&lt;strong&gt;Active Generation&lt;/strong&gt; - Generative models for black box&amp;nbsp;optimisation&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/active_generation.png" alt="Active generation"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Active generation as implemented by variational search distributions (&lt;span class="caps"&gt;VSD&lt;/span&gt;).
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Active generation advances the union of generative modelling and black-box
optimisation so that &lt;span class="caps"&gt;AI&lt;/span&gt; systems can design new artefacts &amp;#8212; from molecules and
materials to robotic components and algorithms &amp;#8212; directly from high-level
objectives. We combine powerful generative priors (transformers, flow matching
etc.) with machine-learning optimisation loops that decide which experiments to
run next, allowing the model to continually refine both its predictive beliefs
and its search distribution. This fusion turns design problems that once relied
on trial-and-error into targeted, data-driven discovery pipelines, yielding
scalable tools for any domain where evaluating a candidate is expensive but
generating hypotheses is&amp;nbsp;cheap.&lt;/p&gt;
&lt;p&gt;In &lt;a href="https://openreview.net/forum?id=1vrpdV9U3i"&gt;Variational Search Distributions&lt;/a&gt;
(&lt;span class="caps"&gt;VSD&lt;/span&gt;) we apply variational inference to the problem of active generation, and
introduce a flexible framework for designing sequences such as proteins, with
formal guarantees. Software for &lt;span class="caps"&gt;VSD&lt;/span&gt; can be found
&lt;a href="https://github.com/csiro-funml/variationalsearch"&gt;here&lt;/a&gt;. We extend this work
to multi-objective optimisation problems in &lt;a href="https://arxiv.org/abs/2510.21052"&gt;Amortized Active Generation of
Pareto Sets&lt;/a&gt;, and then to reward-model-free
settings in &lt;a href="https://arxiv.org/abs/2510.25240"&gt;Generative Bayesian Optimization: Generative Models as Acquisition
Functions&lt;/a&gt;. We have also applied these methods
to &lt;a href="https://www.biorxiv.org/content/10.1101/2025.11.02.685536v2.abstract"&gt;actual protein engineering tasks&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We also investigate the spectral properties of sequence (protein, &lt;span class="caps"&gt;DNA&lt;/span&gt;)
landscapes in &lt;a href="https://proceedings.mlr.press/v258/zhu25c.html"&gt;Protein fitness landscape: spectral graph theory
perspective&lt;/a&gt;. Using our
theoretical framework we present propagational convolutional neural networks
(&lt;span class="caps"&gt;PCNN&lt;/span&gt;), for which we derive theoretical guarantees on the generalization and
convergence properties for protein property&amp;nbsp;prediction.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Causal Inference — Machine Learning for Evidence-Based Policy</title><link href="https://dsteinberg.github.io/causal-inference.html" rel="alternate"></link><published>2022-01-01T00:00:00+11:00</published><updated>2022-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2022-01-01:/causal-inference.html</id><summary type="html">&lt;h3&gt;&lt;strong&gt;Causal Inference&lt;/strong&gt; - Machine learning as a tool for evidence-based&amp;nbsp;policy&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/causal-dag.png" alt="Simple causal diagram"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Graphical representation of the relationships assumed in a simple
        causal model.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Machine learning (&lt;span class="caps"&gt;ML&lt;/span&gt;) can be a useful tool for observational causal inference
studies, one of the cornerstones of evidence-based policy. &lt;span class="caps"&gt;ML&lt;/span&gt; can help us
capture complex relationships in the …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;&lt;strong&gt;Causal Inference&lt;/strong&gt; - Machine learning as a tool for evidence-based&amp;nbsp;policy&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/causal-dag.png" alt="Simple causal diagram"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Graphical representation of the relationships assumed in a simple
        causal model.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Machine learning (&lt;span class="caps"&gt;ML&lt;/span&gt;) can be a useful tool for observational causal inference
studies, one of the cornerstones of evidence-based policy. &lt;span class="caps"&gt;ML&lt;/span&gt; can help us
capture complex relationships in the data, thereby helping mitigate bias from
model mis-specification. Also, use of regularisation in machine learning can
lead to causal estimates with less error compared to unbiased methods when we
have many related confounding factors in our data. I helped to write a &lt;a href="https://medium.com/gradient-institute/machine-learning-as-a-tool-for-evidence-based-policy-3242f4a545b8"&gt;blog
post&lt;/a&gt;
on this subject, and at &lt;a href="https://gradientinstitute.org/case-studies/act-education-directorate/"&gt;Gradient
Institute&lt;/a&gt;
we have used machine learning for observational studies such as &lt;a href="https://doi.org/10.1038/s41598-022-05780-0"&gt;linking youth
well-being to academic success&lt;/a&gt;.
Reporting non-linear causal effects requires a new methodology; the software we
developed for this can be found
&lt;a href="https://github.com/gradientinstitute/causal-inspection"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Algorithmic Fairness — Fair Regression Algorithms</title><link href="https://dsteinberg.github.io/algorithmic-fairness.html" rel="alternate"></link><published>2020-01-01T00:00:00+11:00</published><updated>2020-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2020-01-01:/algorithmic-fairness.html</id><summary type="html">&lt;h3&gt;&lt;strong&gt;Algorithmic Fairness&lt;/strong&gt; - Fair Regression&amp;nbsp;Algorithms&lt;/h3&gt;
&lt;div class="col-sm-5 col-md-5" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/variance.png" alt="Unfairness toy example"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        A simulated dataset depicting an unfair prediction under the
        &amp;#8220;separation&amp;#8221; and &amp;#8220;sufficiency&amp;#8221; fairness criteria.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Algorithmic fairness involves expressing notions such as equity, equality, or
reasonable treatment, as quantifiable measures that a machine learning
algorithm can optimise. Mathematising these concepts so they can be inferred
from …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;&lt;strong&gt;Algorithmic Fairness&lt;/strong&gt; - Fair Regression&amp;nbsp;Algorithms&lt;/h3&gt;
&lt;div class="col-sm-5 col-md-5" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/variance.png" alt="Unfairness toy example"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        A simulated dataset depicting an unfair prediction under the
        &amp;#8220;separation&amp;#8221; and &amp;#8220;sufficiency&amp;#8221; fairness criteria.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Algorithmic fairness involves expressing notions such as equity, equality, or
reasonable treatment, as quantifiable measures that a machine learning
algorithm can optimise. Mathematising these concepts so they can be inferred
from data is challenging, as is deciding on the balance between fairness and
other objectives such as accuracy in a particular application. My research in
this area along with others at the &lt;a href="https://gradientinstitute.org"&gt;Gradient
Institute&lt;/a&gt; has thus far focused on regression
algorithms. Measuring the fairness of a regression algorithm is difficult
compared to the classification case for many popular fairness criteria.
Similarly, adjusting the predictions of a regressor is more complex than doing
so for a classifier, and so our research has been targeting these areas. Here
you can read more about &lt;a href="https://arxiv.org/abs/2001.06089"&gt;measurement&lt;/a&gt;, and
&lt;a href="https://arxiv.org/abs/2002.06200"&gt;adjusting&lt;/a&gt; regression&amp;nbsp;algorithms.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Landshark — Large-Scale Spatial Inference with TensorFlow</title><link href="https://dsteinberg.github.io/landshark.html" rel="alternate"></link><published>2018-01-01T00:00:00+11:00</published><updated>2018-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2018-01-01:/landshark.html</id><summary type="html">&lt;h3&gt;&lt;strong&gt;Landshark&lt;/strong&gt; - Large-scale Spatial Inference with&amp;nbsp;Tensorflow&lt;/h3&gt;
&lt;div class="col-sm-8 col-md-8" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/landshark.jpg" alt="Predictive entropy"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        The predictive entropy (uncertainty) of the concentration of an element
        in soils in Western Australia.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Landshark is a set of Python command-line tools for supervised learning
problems on large spatial raster datasets. It solves problems in which the user
has a set …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;&lt;strong&gt;Landshark&lt;/strong&gt; - Large-scale Spatial Inference with&amp;nbsp;Tensorflow&lt;/h3&gt;
&lt;div class="col-sm-8 col-md-8" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/landshark.jpg" alt="Predictive entropy"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        The predictive entropy (uncertainty) of the concentration of an element
        in soils in Western Australia.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Landshark is a set of Python command-line tools for supervised learning
problems on large spatial raster datasets. It solves problems in which the user
has a set of target point measurements, such as geochemistry, soil
classification, or depth to basement, and wants to relate those to a number of
raster covariates, like satellite imagery or geophysics, to predict the targets
on the raster&amp;nbsp;grid.&lt;/p&gt;
&lt;p&gt;Landshark fills a particular niche: where we want to efficiently learn models
with very large numbers of training points and/or very large covariate images
using TensorFlow. Landshark is particularly useful for the case when the
training data itself will not fit in memory, and must be streamed to a
minibatch stochastic gradient descent algorithm for model&amp;nbsp;learning.&lt;/p&gt;
&lt;p&gt;Please see the &lt;a href="https://github.com/data61/landshark"&gt;Landshark project page&lt;/a&gt;
for more&amp;nbsp;information.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Aboleth — A TensorFlow Framework for Bayesian Deep Learning</title><link href="https://dsteinberg.github.io/aboleth.html" rel="alternate"></link><published>2017-01-01T00:00:00+11:00</published><updated>2017-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2017-01-01:/aboleth.html</id><summary type="html">&lt;h3&gt;&lt;strong&gt;Aboleth&lt;/strong&gt; - A TensorFlow Framework for Bayesian Deep&amp;nbsp;Learning&lt;/h3&gt;
&lt;div class="col-sm-7 col-md-7" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/bayesnn.png" alt="A Bayesian Neural Net"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Depiction of a Bayesian Neural Net that is easily constructed using
        Aboleth.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;I am one of the primary creators of Aboleth, a bare-bones TensorFlow framework
for Bayesian deep learning and Gaussian process&amp;nbsp;approximation.&lt;/p&gt;
&lt;p&gt;The purpose of Aboleth is to provide a set …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;&lt;strong&gt;Aboleth&lt;/strong&gt; - A TensorFlow Framework for Bayesian Deep&amp;nbsp;Learning&lt;/h3&gt;
&lt;div class="col-sm-7 col-md-7" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/bayesnn.png" alt="A Bayesian Neural Net"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Depiction of a Bayesian Neural Net that is easily constructed using
        Aboleth.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;I am one of the primary creators of Aboleth, a bare-bones TensorFlow framework
for Bayesian deep learning and Gaussian process&amp;nbsp;approximation.&lt;/p&gt;
&lt;p&gt;The purpose of Aboleth is to provide a set of high-performance, lightweight
components for building Bayesian neural nets and approximate (deep) Gaussian
process computational graphs. We aim for minimal abstraction over pure
TensorFlow, so you can still assign parts of the computational graph to
different hardware, use your own data feeds/queues, and manage your own
sessions&amp;nbsp;etc.&lt;/p&gt;
&lt;p&gt;The project page is on &lt;a href="https://github.com/data61/aboleth"&gt;github&lt;/a&gt;.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Revrand — Scalable Bayesian Generalised Linear Models</title><link href="https://dsteinberg.github.io/revrand.html" rel="alternate"></link><published>2016-01-01T00:00:00+11:00</published><updated>2016-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2016-01-01:/revrand.html</id><summary type="html">&lt;h3&gt;&lt;strong&gt;Revrand&lt;/strong&gt; - Scalable Bayesian Generalised Linear&amp;nbsp;Models&lt;/h3&gt;
&lt;div&gt;
    &lt;div class="col-sm-5 col-md-4" style="margin-left: -15px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://github.com/NICTA/revrand/raw/master/docs/glm_sgd_demo.png"
                 alt="Regression and GLM Demo" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            revrand uses recent advances in large scale kernel methods to
            approximate kernel machines, such as Gaussian processes, with
            linear models. Using this technology we can harness the inferential
            power of kernel machines while exploiting the scalability of linear
            models.
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="col-sm-5 col-md-4" style="margin-left: -15px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://github.com/NICTA/revrand/raw/master/docs/glm_demo.png"
             alt="GLM Demo" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            revrand also uses recent advances …&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</summary><content type="html">&lt;h3&gt;&lt;strong&gt;Revrand&lt;/strong&gt; - Scalable Bayesian Generalised Linear&amp;nbsp;Models&lt;/h3&gt;
&lt;div&gt;
    &lt;div class="col-sm-5 col-md-4" style="margin-left: -15px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://github.com/NICTA/revrand/raw/master/docs/glm_sgd_demo.png"
                 alt="Regression and GLM Demo" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            revrand uses recent advances in large scale kernel methods to
            approximate kernel machines, such as Gaussian processes, with
            linear models. Using this technology we can harness the inferential
            power of kernel machines while exploiting the scalability of linear
            models.
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="col-sm-5 col-md-4" style="margin-left: -15px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://github.com/NICTA/revrand/raw/master/docs/glm_demo.png"
             alt="GLM Demo" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            revrand also uses recent advances in variational inference to
            accurately approximate fully Bayesian posteriors for non-conjugate
            models, such as generalised linear models. In this way it can
            provide comprehensive measures of uncertainty in its predictions.
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;I am the project creator and primary contributor to &lt;em&gt;revrand&lt;/em&gt;, a software
library that implements Bayesian linear models (Bayesian linear regression) and
generalised linear models. A few features of this library&amp;nbsp;are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A basis functions/feature composition framework for combining basis functions
  like radial basis functions, sigmoidal basis functions, polynomial basis
  functions&amp;nbsp;etc.&lt;/li&gt;
&lt;li&gt;Basis functions that can be used to approximate Gaussian processes with shift
  invariant covariance functions (e.g. squared exponential) when used with
  linear&amp;nbsp;models.&lt;/li&gt;
&lt;li&gt;Non-Gaussian likelihoods with Bayesian generalised linear models using a
  modified version of the nonparametric variational inference algorithm with
  large scale learning using stochastic gradients (&lt;span class="caps"&gt;ADADELTA&lt;/span&gt;, Adam and&amp;nbsp;others).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The project page is on &lt;a href="https://github.com/NICTA/revrand"&gt;github&lt;/a&gt;.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>The Impact of Computerisation and Automation on Future Employment</title><link href="https://dsteinberg.github.io/job-automation.html" rel="alternate"></link><published>2015-06-01T00:00:00+10:00</published><updated>2015-06-01T00:00:00+10:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2015-06-01:/job-automation.html</id><summary type="html">&lt;h3&gt;The impact of computerisation and automation on future&amp;nbsp;employment&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/job-automation.jpg" alt="LGA probability of job loss."&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Weighted probability of job loss through computerisation and automation
        in local government areas of Australia.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;This work is a qualitative study into the susceptibility of jobs in Australia
to computerisation and automation over the next 10 to 15 years. The methodology …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;The impact of computerisation and automation on future&amp;nbsp;employment&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/job-automation.jpg" alt="LGA probability of job loss."&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Weighted probability of job loss through computerisation and automation
        in local government areas of Australia.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;This work is a qualitative study into the susceptibility of jobs in Australia
to computerisation and automation over the next 10 to 15 years. The methodology
and initial data used is based on the much-cited paper by Frey and Osborne,
which studied this same problem for the United States (&lt;span class="caps"&gt;US&lt;/span&gt;) and, more recently,
for the United Kingdom (&lt;span class="caps"&gt;UK&lt;/span&gt;). The key to this work is trying to understand and
quantify the impact of emerging technology on jobs and employment in areas such
as artificial intelligence, robotics and machine&amp;nbsp;learning.&lt;/p&gt;
&lt;p&gt;The results show that 40 per cent of jobs in Australia have a high probability
of being susceptible to computerisation and automation in the next 10 to 15
years. Jobs in administration and some services are particularly susceptible,
as are regions that have historically been associated with the mining industry. Jobs
in the professions, in technical and creative industries, and in personal
service areas (health for example) are least susceptible to automation. The
report can be found &lt;a href="http://adminpanel.ceda.com.au/FOLDERS/Service/Files/Documents/26792~Futureworkforce_June2015.pdf"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Extended and Unscented Kitchen Sinks</title><link href="https://dsteinberg.github.io/extended-unscented-kitchen-sinks.html" rel="alternate"></link><published>2015-01-01T00:00:00+11:00</published><updated>2015-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2015-01-01:/extended-unscented-kitchen-sinks.html</id><summary type="html">&lt;h3&gt;Extended and Unscented Kitchen&amp;nbsp;Sinks&lt;/h3&gt;
&lt;div class="col-sm-8 col-md-8" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/EKS-seismic.png" alt="EKS seismic inversion"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Example results of the extended kitchen sinks (&lt;span class="caps"&gt;EKS&lt;/span&gt;) algorithm on an
        interpreted seismic inversion problem, where we wish to infer the below
        ground structure of the Earth from sound wave reflection times. The
        inferred rock-type layer boundaries (left) and seismic velocities
        (right) are shown in …&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</summary><content type="html">&lt;h3&gt;Extended and Unscented Kitchen&amp;nbsp;Sinks&lt;/h3&gt;
&lt;div class="col-sm-8 col-md-8" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/EKS-seismic.png" alt="EKS seismic inversion"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Example results of the extended kitchen sinks (&lt;span class="caps"&gt;EKS&lt;/span&gt;) algorithm on an
        interpreted seismic inversion problem, where we wish to infer the below
        ground structure of the Earth from sound wave reflection times. The
        inferred rock-type layer boundaries (left) and seismic velocities
        (right) are shown in blue, indicating the predictive means and standard
        deviation envelopes. Draws from the &lt;span class="caps"&gt;MCMC&lt;/span&gt; inversion are overlaid in
        dotted black.
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;In this work we extended our Bayesian nonparametric algorithms for inverse
problems, the unscented and extended Gaussian processes, to work with multiple
outputs and over large datasets. The new algorithms are called extended and
unscented kitchen sinks (&lt;span class="caps"&gt;EKS&lt;/span&gt; and &lt;span class="caps"&gt;UKS&lt;/span&gt;) since they use the random kitchen sink (or
basis function) approximation for scaling kernel machines. This approximation
also lets the &lt;span class="caps"&gt;EKS&lt;/span&gt; and &lt;span class="caps"&gt;UKS&lt;/span&gt; handle multiple-output
scenarios straightforwardly, making these algorithms useful for a wide
variety of complex nonlinear inversion problems, such as geophysical&amp;nbsp;inversions.&lt;/p&gt;
&lt;div style="clear: both;"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Nonparametric Bayesian Inverse Problems</title><link href="https://dsteinberg.github.io/bayesian-inverse-problems.html" rel="alternate"></link><published>2014-01-01T00:00:00+11:00</published><updated>2014-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2014-01-01:/bayesian-inverse-problems.html</id><summary type="html">&lt;h3&gt;Nonparametric Bayesian Inverse&amp;nbsp;Problems&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/ugp-sign.png" alt="UGP signum test"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Example of learning the unscented Gaussian process (&lt;span class="caps"&gt;UGP&lt;/span&gt;) with a
        non-differentiable nonlinear function (forward model) in the likelihood
        - a polynomial with one term in a signum function. Here only the black
        dots are seen by the algorithm, and the nonlinear function transforming
        the blue line to …&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</summary><content type="html">&lt;h3&gt;Nonparametric Bayesian Inverse&amp;nbsp;Problems&lt;/h3&gt;
&lt;div class="col-sm-6 col-md-6" style="margin-left: -15px;"&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/ugp-sign.png" alt="UGP signum test"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Example of learning the unscented Gaussian process (&lt;span class="caps"&gt;UGP&lt;/span&gt;) with a
        non-differentiable nonlinear function (forward model) in the likelihood:
        a polynomial with one term in a signum function. Here only the black
        dots are seen by the algorithm, and the nonlinear function transforming
        the blue line to the green is known, but not its inverse. The aim is to
        estimate the latent function (blue line) from the black dots, without
        knowing the inverse function. In this figure we show the predictive
        distributions of the latent function (red dashed line and standard
        deviation bounds) and of the observations (green line and standard
        deviation bounds).
    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Nonlinear inversion problems, where we wish to infer the latent inputs to a
system given observations of its output and the system&amp;#8217;s forward model, have a
long history in the natural sciences, dynamical modeling and estimation. An
example is the robot-arm inverse kinematics problem, where we wish to infer how
to drive the robot&amp;#8217;s joints (i.e. joint torques) in order to place the
end-effector in a particular position, given we can measure its position and
know the forward kinematics of the arm. Most of the existing algorithms either
estimate the system inputs at a particular point in time, like the
Levenberg-Marquardt algorithm, or in a recursive manner, such as the extended
and unscented Kalman filters (&lt;span class="caps"&gt;EKF&lt;/span&gt;, &lt;span class="caps"&gt;UKF&lt;/span&gt;). In many inversion problems we have a
continuous process, for example the smooth trajectory of a robot arm.
Non-parametric regression techniques like Gaussian processes seem applicable,
and have been used in linear inversion&amp;nbsp;problems.&lt;/p&gt;
&lt;p&gt;In this work we present two new methods for inference in Gaussian process (&lt;span class="caps"&gt;GP&lt;/span&gt;)
models with general nonlinear likelihoods. Inference is based on a variational
framework where a Gaussian posterior is assumed and the likelihood is
linearized about the variational posterior mean using either a Taylor series
expansion or statistical linearization. We show that the parameter updates
obtained by these algorithms are equivalent to the state update equations in
the iterative extended and unscented Kalman filters respectively, hence we
refer to our algorithms as extended and unscented GPs. The unscented &lt;span class="caps"&gt;GP&lt;/span&gt; treats
the likelihood as a &amp;#8216;black-box&amp;#8217; by not requiring its derivative for inference,
so it also applies to non-differentiable likelihood models. We evaluate the
performance of our algorithms on a number of synthetic inversion problems and a
binary classification dataset. See our &lt;a href="http://papers.nips.cc/paper/5455-extended-and-unscented-gaussian-processes"&gt;&lt;span class="caps"&gt;NIPS&lt;/span&gt; &lt;em&gt;spotlight&lt;/em&gt;
paper&lt;/a&gt;
for more&amp;nbsp;details.&lt;/p&gt;
&lt;div style="clear: both"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Unsupervised Scene “Understanding”</title><link href="https://dsteinberg.github.io/scene-understanding.html" rel="alternate"></link><published>2013-01-01T00:00:00+11:00</published><updated>2013-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2013-01-01:/scene-understanding.html</id><summary type="html">&lt;h3&gt;Unsupervised Scene&amp;nbsp;&amp;#8220;Understanding&amp;#8221;&lt;/h3&gt;
&lt;div&gt;
    &lt;div class="col-sm-4 col-md-3" style="margin-left: -15px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/MSRC_im_ex.jpg" alt="MSRC clusters" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            Sample images belonging to image clusters found by an algorithm
            that can use both whole-image features and distributions of objects
            to describe images. The image clusters are shown row-wise.
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="col-sm-4 col-md-3" style="margin-left: -15px; margin-right: 0px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/MSRC_seg_ex.jpg" alt="MSRC segments" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            The learned segment clusters corresponding to the images in the
            previous figure. The composition and proportions of these …&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</summary><content type="html">&lt;h3&gt;Unsupervised Scene&amp;nbsp;&amp;#8220;Understanding&amp;#8221;&lt;/h3&gt;
&lt;div&gt;
    &lt;div class="col-sm-4 col-md-3" style="margin-left: -15px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/MSRC_im_ex.jpg" alt="MSRC clusters" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            Sample images belonging to image clusters found by an algorithm
            that can use both whole-image features and distributions of objects
            to describe images. The image clusters are shown row-wise.
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="col-sm-4 col-md-3" style="margin-left: -15px; margin-right: 0px;"&gt;
    &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/MSRC_seg_ex.jpg" alt="MSRC segments" /&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            The learned segment clusters corresponding to the images in the
            previous figure. The composition and proportions of these segment
            clusters (coloured regions) are fairly consistent within an image
            cluster.
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;For very large scientific datasets with many image classes and objects,
producing the ground-truth data for supervised (trained) algorithms can
represent a substantial, and potentially expensive, human effort. In these
situations there is scope for the use of unsupervised approaches, such as
clustering, which can model collections of images and automatically summarise
their content without human&amp;nbsp;training.&lt;/p&gt;
&lt;p&gt;To explore how modelling context affects clustering results, I derived several
new algorithms that simultaneously cluster images and segments (super-pixels)
within images. These algorithms also model collections of photos such as photo
albums. Images are defined by whole-scene descriptors &lt;em&gt;and&lt;/em&gt; the distribution of
&amp;#8220;objects&amp;#8221; (segment clusters) within them. The images and segments are clustered
using this joint representation, which is also more interpretable by people.
The intuition behind this approach is that by knowing something about the type
of scene (image cluster), object detection (segment clustering) can be
improved. That is, we are likely to find trees in a forest. Additionally, by
knowing about the distribution and co-occurrence of objects in an image, we
have a better idea of the type of scene (cows and grass most likely make a
rural&amp;nbsp;scene).&lt;/p&gt;
&lt;p&gt;These algorithms for unsupervised scene understanding outperform other
unsupervised algorithms for segment and scene clustering because of how they
model context. They were even found to be competitive with state-of-the-art
supervised and semi-supervised approaches to scene understanding, and they
scale to larger datasets. See
my &lt;a href="https://dsteinberg.github.io/docs/Steinberg_ICCV2013_MCM.pdf"&gt;&lt;span class="caps"&gt;ICCV&lt;/span&gt; paper&lt;/a&gt;,
&lt;a href="https://dsteinberg.github.io/docs/CVIU_scene.pdf"&gt;&lt;span class="caps"&gt;CVIU&lt;/span&gt; article&lt;/a&gt; and my &lt;a href="https://dsteinberg.github.io/docs/Thesis.pdf"&gt;thesis (ch. 5 &lt;span class="amp"&gt;&amp;amp;&lt;/span&gt;
6)&lt;/a&gt; for more&amp;nbsp;information.&lt;/p&gt;
&lt;div style="clear: both"&gt;&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Clustering Images Over Many Datasets</title><link href="https://dsteinberg.github.io/clustering-images-albums.html" rel="alternate"></link><published>2012-01-01T00:00:00+11:00</published><updated>2012-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2012-01-01:/clustering-images-albums.html</id><summary type="html">&lt;h3&gt;Clustering Images Over Many&amp;nbsp;Datasets&lt;/h3&gt;
&lt;p&gt;Large image collections are frequently partitioned into distinct but related
groups, such as photo albums from distinct environments that contain similar
scenes. For example, a hiking holiday album may contain many images of forests
and maybe a few villages, whereas a conference trip album may …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;Clustering Images Over Many&amp;nbsp;Datasets&lt;/h3&gt;
&lt;p&gt;Large image collections are frequently partitioned into distinct but related
groups, such as photo albums from distinct environments that contain similar
scenes. For example, a hiking holiday album may contain many images of forests
and maybe a few villages, whereas a conference trip album may have many urban
scenes and images of people, with perhaps a few images of park-land. These
groups, or albums, may be thought of as providing context for the images they&amp;nbsp;contain.&lt;/p&gt;
&lt;p&gt;I have formulated and applied a latent Dirichlet allocation-like algorithm to
this problem. It shares image clusters between groups or albums, and keeps the
proportion of clusters (mixture-weights) specific to each group, thereby
modelling the context of the group. As a result, the algorithm finds better
clusters, and is often faster on large datasets, than regular
mixture-model-based approaches. See my
&lt;a href="https://dsteinberg.github.io/docs/Thesis.pdf"&gt;thesis (ch. 4)&lt;/a&gt; for more&amp;nbsp;information.&lt;/p&gt;
&lt;div class="panel panel-default"&gt;
    &lt;div class="panel-body"&gt;
        &lt;img src="https://dsteinberg.github.io/images/halbums_gmc_sub.jpg" alt="Album clusters"&gt;
    &lt;/div&gt;
    &lt;div class="panel-footer"&gt;
        Here 10,300 images from 12 holiday photo albums are clustered. Shown
        are the most and least &amp;#8220;likely&amp;#8221; images from seven clusters (out of 23).
        Also shown are the most frequent five tags from Flickr associated with
        the clusters. The algorithms that could model these photo albums found
        more self-consistent clusters than the algorithms that could not, such
        as regular mixture models. This took less than a minute to run. Again,
        these are entirely unsupervised algorithms.
    &lt;/div&gt;
&lt;/div&gt;</content><category term="Projects"></category></entry><entry><title>Clustering Images of the Seafloor</title><link href="https://dsteinberg.github.io/clustering-seafloor.html" rel="alternate"></link><published>2011-01-01T00:00:00+11:00</published><updated>2011-01-01T00:00:00+11:00</updated><author><name>Dan Steinberg</name></author><id>tag:dsteinberg.github.io,2011-01-01:/clustering-seafloor.html</id><summary type="html">&lt;h3&gt;Clustering Images of the&amp;nbsp;Seafloor&lt;/h3&gt;
&lt;p&gt;I have applied a Bayesian non-parametric algorithm, the variational Dirichlet
process (with Gaussian clusters), to clustering large quantities of seafloor
imagery (obtained from an autonomous underwater vehicle or &lt;span class="caps"&gt;AUV&lt;/span&gt;) in an
unsupervised manner. The algorithm has the attractive property that it does not
require knowledge …&lt;/p&gt;</summary><content type="html">&lt;h3&gt;Clustering Images of the&amp;nbsp;Seafloor&lt;/h3&gt;
&lt;p&gt;I have applied a Bayesian non-parametric algorithm, the variational Dirichlet
process (with Gaussian clusters), to clustering large quantities of seafloor
imagery (obtained from an autonomous underwater vehicle or &lt;span class="caps"&gt;AUV&lt;/span&gt;) in an
unsupervised manner. The algorithm has the attractive property that it does not
require the number of clusters to be specified in advance, which enables
truly autonomous sensor data abstraction. The underlying image representation
uses descriptors for colour, texture and 3D structure that are obtained from
stereo cameras. This approach consistently produces easily recognisable
clusters that approximately correspond to different habitat types. These
clusters are useful in observing spatial patterns, focusing expert analysis on
subsets of seafloor imagery, aiding mission planning, and potentially informing
real-time adaptive sampling. See my
&lt;a href="http://www.isrr-2011.org/ISRR-2011//Program_files/Papers/Williams-ISRR-2011.pdf"&gt;&lt;span class="caps"&gt;ISRR&lt;/span&gt; paper&lt;/a&gt;
for more&amp;nbsp;details.&lt;/p&gt;
&lt;div class="row"&gt;
    &lt;div class="col-sm-4"&gt;
        &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/scott25_img.jpg" alt="Cluster example"&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            An example of images from an &lt;span class="caps"&gt;AUV&lt;/span&gt; survey that have been clustered.
            This survey has 10,000 images within it. Some sample images
            belonging to each of the 6 clusters found by the algorithm are
            shown row-wise. The algorithm took only a few seconds to
            obtain these results, and needed no human-generated training data.
            That is, the algorithm found these image clusters with no human
            input.&lt;br&gt;
        &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="col-sm-4"&gt;
        &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/scott25_mos.jpg" alt="Survey"&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            Top-down mosaic of the survey.
        &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="col-sm-4"&gt;
        &lt;div class="panel panel-default"&gt;
        &lt;div class="panel-body"&gt;
            &lt;img src="https://dsteinberg.github.io/images/scott25_xy.png" alt="Survey labels"&gt;
        &lt;/div&gt;
        &lt;div class="panel-footer"&gt;
            Top-down view of image locations, coloured by image cluster labels.
        &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;</content><category term="Projects"></category></entry></feed>