Summary
If you are interested in a library for working with graph structures that will also help you learn more about the research and theory behind the algorithms, then look no further than graph-tool. In this episode Tiago Peixoto shares his work on graph algorithms and networked data and how he has built graph-tool to help in that research. He explains how it is implemented, how it evolved from a simple command line tool to a full-fledged library, and the benefits that he has found from building a personal project in the open.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Tiago Peixoto about graph-tool, an efficient Python module for manipulation and statistical analysis of graphs.
Interview
- Introductions
- How did you get introduced to Python?
- Can you describe what graph-tool is and the story behind it?
- What are some scenarios where someone might encounter a graph-oriented data set?
- In what ways are those graphs typically represented?
- In your experience, how much overlap is there between people who work with networked data and people who use graph-native databases (e.g. Neo4J, DGraph)?
- What kinds of analysis or manipulation might someone need to perform on a graph structure?
- There are a few different tools in Python for working with networked data. How would you characterize the current ecosystem and why someone might choose graph-tool?
- Can you describe how graph-tool is implemented?
- How have the goals and design of the package changed or evolved since you first began working on it?
- Who are your target users and what are the guiding principles that you use to inform the API design for the package?
- How much knowledge of graph theory or algorithms is required to make effective use of graph-tool?
- Can you talk through an example workflow of using graph-tool to load, process, and analyze a graph?
- What are some of the overlooked or underutilized aspects of graph-tool that you think more people should know about?
- What are some systems/applications that you have seen which would be simplified by adopting a graph model for their data?
- What is your impression of the overall awareness of the benefits of graphs for simplifying aspects of data processing and analysis?
- What are some cases where a graph structure adds unnecessary complexity?
- What are the most interesting, innovative, or unexpected ways that you have seen graph-tool used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on graph-tool?
- When is graph-tool the wrong choice?
- What do you have planned for the future of graph-tool?
Keep In Touch
Picks
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
- To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- Central European University
- NetworkX
- GML
- GraphML
- Neo4J
- DGraph
- NetworKit
- igraph
- Matplotlib
- C++ Templates
- Boost Graph Library
- OpenMP
- Maximum Matching
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA