Why Does Theory Matter in Computer Science? /

Why Does Theory Matter in Computer Science? (Part 5)

References, Resources, and Further Reading

This is the fifth installment in the written version of my talk about why I think theory matters in computer science. You can find part 1 here, part 2 here, part 3 here, part 4 here, and the entire series is archived here.

Here is a long list of sources I consulted at various points while doing research for this project. As it might have become clear through reading the previous installments of this series, this was more a talk about the densest subgraph problem than it was about theory being useful in computer science. There are a few reasons for that, which I might explain in more detail at some point in the future¹. This means that I presented a lot of information that I borrowed from other people! Here’s an annotated bibliography of sorts, in case you were interested in going deeper, for whatever reason.

Key Sources I Consulted

Chandra Chekuri’s invited talk at the Tutte Seminar, University of Waterloo C&O department
Link: https://www.youtube.com/watch?v=DO8iNCwmp0c

In this talk, Chekuri talks about the paper he coauthored that had to do with iterative peeling. I found it very helpful to watch this video (very slowly) while trying to decipher the paper².

Jeff A. Bilmes: Deep mathematical properties of submodularity with applications to machine learning
Link: https://www.youtube.com/watch?v=ZycBUGLD22E

This is a two-hour long tutorial about submodularity that Jeff Bilmes presented at Microsoft Research in 2016. I have no idea how I found this (maybe I just started searching “submodularity” on YouTube or something), but it was a total godsend. I owe so much to this video: it filled in gaps I didn’t know I had and provided context I wouldn’t have known how to look for. The amount of information condensed into this video is truly impressive, and I spent many, many more than two hours watching and trying to digest it.

Bilmes also teaches a course on this, and I referenced his course notes a few times. Here’s the link: https://people.ece.uw.edu/bilmes/classes/ee563/ee563_fall_2020/

Background and other Links

Vital Sine’s YouTube series on Hypergraph Theory: https://youtube.com/playlist?list=PLZ2xtht8y2-IRjvXJJpka2GIJ3wn7u4g2
Jonatan Lindén’s technical report on Hypergraphs and Matroids: https://perso.ens-lyon.fr/eric.thierry/Graphes2007/jonatan-linden.pdf
The Wikipedia article on set functions: https://en.wikipedia.org/wiki/Set_function
The Wolfram alpha entry on set functions: https://mathworld.wolfram.com/SetFunction.html
Notes on submodular optimization from CS 880: Approximation and Online Algorithms, University of Wisconsin: https://pages.cs.wisc.edu/~shuchi/courses/880-F19/
Notes on Minimum Cuts and Modular Functions: https://www.robots.ox.ac.uk/~pankaj/notes/notes_cunningham.pdf
Cornell University, CS 6280: Algorithms - Bobby Kleinberg’s notes on Submodular functions: https://www.cs.cornell.edu/courses/cs6820/2019fa/handouts/submod.pdf
Chekuri’s Notes from his Approximation Algorithms course (CS 583, Spring 2018, Introduction Lecture): https://courses.grainger.illinois.edu/cs583/sp2018/Notes/intro.pdf
Cornell University Computational Optimization Open Textbook, Heuristic Algorithms page: https://optimization.cbe.cornell.edu/index.php?title=Heuristic_algorithms
The Wikipedia article on heuristics: https://en.wikipedia.org/wiki/Heuristic_(computer_science)
Local Search Algorithms in AI: A Comprehensive Guide: https://www.simplilearn.com/local-search-algorithms-in-ai-article
Wikipedia article on Relaxations: https://en.wikipedia.org/wiki/Relaxation_(approximation)
Charalampos E. Tsourakakis and Tianyi Chen, Dense subgraph discovery (DSD) theory and applications: https://tsourakakis.com/dense-subgraph-discovery-theory-and-applications-tutorial-sdm-2021/
Jan Vondrák, Continuous extensions of submodular functions: https://theory.stanford.edu/~jvondrak/CS369P/lec17.pdf

Academic Papers

A Survey on the Densest Subgraph Problem and its Variants (2024)
Authors: Tommaso Lanciano, Atsushi Miyauchi, Adriano Fazzone, and Francesco Bonchi
DOI: https://doi.org/10.1145/3653298
ArXiv Link: https://arxiv.org/abs/2303.14467

This is an academic paper that covers pretty much everything that’s been published about the densest subgraph problem (except for some very recent stuff).

Greedy approximation algorithms for finding dense components in a graph (2000)
Author: Moses Charikar
DOI: https://doi.org/10.1007/3-540-44436-X_10
here’s a link to a PDF

An academic paper about a greedy “peeling” algorithm for solving the DSP. This paper is very famous.

Flowless: Extracting Densest Subgraphs Without Flow Computations (2020)
Authors: Digvijay Boob, Yu Gao, Richard Peng, Saurabh Sawlani, Charalampos E. Tsourakakis, Di Wang, Junxing Wang
DOI: https://doi.org/10.1145/3366423.3380140
ArXiv Link: https://arxiv.org/abs/1910.07087

This is the original paper on “iterative peeling” for the DSP (Greedy++). These authors also ran a bunch of tests with real data to show how the algorithm performed.

Densest Subgraph: Supermodularity, Iterative Peeling, and Flow (2022)
Authors: Chandra Chekuri, Kent Quanrud, and Manuel R. Torres
DOI: https://doi.org/10.1137/1.9781611977073.64
PDF on Author’s Website: https://chekuri.cs.illinois.edu/papers/densest-subgraph-soda22.pdf

This is the paper that first uses supermodularity to study the DSP and iterative peeling through defining the “densest supermodular set problem”. This is where the algorithm SuperGreedy++ came from. The authors also analyze SuperGreedy++ and prove that it converges to a (1–epsilon)-approximation for the DSP. This paper also contains a bunch of other stuff that I didn’t look into too deeply.

Submodular functions and convexity (1983)
Author: László Lovász
DOI: https://doi.org/10.1007/978-3-642-68874-4_10

I read this paper because I was trying to retrace the origins of the “Lovász extension” and understand what it means. This is the original paper that talks about the links between submodularity, convexity, and concavity.

Submodular function minimization (2013)
Author: S. Thomas McCormick
Link: https://www.lamsade.dauphine.fr/~poc/jpoc8/McCormick-fcts-ss-modulaires.pdf
DOI (earlier version): https://doi.org/10.1016/S0927-0507(05)12007-6

This book chapter is a very complete introduction to submodular function minimization and I stole all sorts of stuff from it. Importantly, it contains a proof that the two definitions of submodularity are equivalent, which I very much appreciated seeing.

On finding dense subgraphs (2009)
Authors: Samir Khuller and Barna Saha
DOI: https://doi.org/10.1007/978-3-642-02927-1_50
PDF on Author’s Website: https://www.cs.umd.edu/~samir/grant/ICALP09.pdf

I think I looked at a proof technique or two in this paper? Honestly, I don’t remember anymore. What I can say is that it provides an alternative proof of a claim in Charikar’s 2000 paper (mentioned above)

Authoritative sources in a hyperlinked environment (1999)
Author: Jon M. Kleinberg
DOI: https://doi.org/10.1145/324133.324140
PDF: https://dl.acm.org/doi/pdf/10.1145/324133.324140

One of the sources I cited in my short survey of “real-world problems” in Part 2.

Studying complex discursive systems (2002)
Authors: Steven R. Corman, Timothy Kuhn, Robert D. McPhee, and Kevin J. Dooley
DOI: https://doi.org/10.1111/j.1468-2958.2002.tb00802.x

Another source I cited in my short survey of “real-world problems” in Part 2. This one is the linguistics paper that refers to a technique called “centering resonance analysis”.

Dense subgraph extraction with application to community detection (2012)
Authors: Jie Chen and Yousef Saad
DOI: https://doi.org/10.1109/TKDE.2010.271

Yet another source I cited in my short survey of “real-world problems” in Part 2.

A blog post where I explain the writing (creation?) process of this talk and series is now available here!

The short version is that I’d been wanting to give a talk about the importance of theory in CS for a long time, and I’d also been wanting to talk about the evolution of algorithms for solving the DSP. However, I was also taking a course where I had to give a technical presentation on algorithms for the DSP the day after I was giving this talk, so I decided to make the first half of it mostly about the important of theory and make the second half be somewhat practice for my presentation the next day. Now, this is a terrible way to give a coherent talk (and I’m sure it felt like I was giving two talks), but I didn’t think I’d get any other chances to give a talk that year, so I did what I felt like I had to do. ↩︎
I kid you not, I found a PDF of his slides, slowed the video down to 0.75 speed, and wrote down what he said almost word for word, which helped fill in some gaps. The key thing to note is that papers tend to assume the reader has the exact same background as the author – talks usually don’t, so they include quite a bit more context. Not enough that I could actually understand what was happening without prior background, but enough that I could begin to stumble through things and figure out what sort of things to look up. ↩︎