Mathematical foundations of sampling connected balanced graph partitions

4. Data Science

These questions are more open-ended, and motivated by a broad desire to understand and analyze graphs and partitions occurring in real world data.

#12 Understanding the space of maps

Add remark

Problem 4.1.
[Ranthony Clark] How do we know how far apart two redistricting maps are? Can you capture the number of steps it takes in a Markov chain or political characteristics? What does the topology of the space of maps look like under such a metric? What is the interplay between the target measure for MCMC and the properties of these metrics?
#13 Computational tools for exploring properties of a distribution over maps

Add remark

Problem 4.2.
[Jonathan Mattingly] Given a target measure, are there simple/computationally feasible tools to study the topology and metric properties of the space of graph partitions?
#14 Partition similarity via spanning forests

Add remark

Problem 4.3.
[Gyaneshwar Agrahari] Can the collection of spanning forests on a pair of graph partitions be used to make define a notion of distance between them or characterize their similarity?
#15 Inference from adjacency data

Add remark

Problem 4.4.
[Ranthony Clark] What properties of the geography or political geography can be read off the graph adjacency data alone?
#16 Computational tools to explain collections of real maps

Add remark

Problem 4.5.
[Gregory Hershlag, Jonathan Mattingly] Given a collection of redistricting maps, how can you find a distribution or family of distributions explaining these maps?

(a) Can we find a good regularization to make the problem better posed? For example, suppose we look at all maps that have ever been proposed and we build some reasonable parameterized distribution explaining them. Now look at 5 maps for a specific case. Can we design the parameterization to make this inference problem easy and not under-defined?
#17 Models of graphs arising in real geographic data

Add remark

Problem 4.6.
[Sarah Cannon, Wesley Pegden] -

(a) What is a good generative model?

(b) Perturbed Delaunay triangulations of random point sets seem to work very well. You add edges with preferential attachment then delete edges randomly. Can we formally investigate how/why this works? What about Voronoi graphs or 2-Delaunay graphs?

(c) What about taking a Delaunay triangulation of a non-uniformly random point set?
#18 Beyond redistricting: processor assignment application

Add remark

Problem 4.7.
[Daryl DeFord] Is there a many-body problem where these kinds of partitioning algorithms can give a speedup when used for processor assignment? This is a dynamic problem where processors can only talk locally to each other and you are allowed to reassign.

Cite this as: AimPL: Mathematical foundations of sampling connected balanced graph partitions, available at http://aimpl.org/connectedbalanced.

Mathematical foundations of sampling connected balanced graph partitions

4. Data Science

#12 Understanding the space of maps

Problem 4.1.

#13 Computational tools for exploring properties of a distribution over maps

Problem 4.2.

#14 Partition similarity via spanning forests

Problem 4.3.

#15 Inference from adjacency data

Problem 4.4.

#16 Computational tools to explain collections of real maps

Problem 4.5.

#17 Models of graphs arising in real geographic data

Problem 4.6.

#18 Beyond redistricting: processor assignment application

Problem 4.7.