I’ve been experimenting with Google Cloud Dataflow recently, which has gotten me interested in other systems in this area. Here are two good related papers:
https://research.google.com/pubs/pub41378.html
http://dl.acm.org/citation.cfm?id=2522738
I believe the first is the basis for Cloud Dataflow and Apache Beam.
What I love about both of these is the work to find new abstractions that greatly simplify the user-facing model.