Telemetry & Observability for Elixir Apps at Cars.com with Zack Kayser & Ethan Gunderson


Dec 12, 2024 | 42 mins

Zack Kayser and Ethan Gunderson, Software Engineers at Cars Commerce, join the Elixir Wizards to share their expertise on telemetry and observability in large-scale systems. Drawing from their experience at Cars.com—a platform handling high traffic and concurrent users—they discuss the technical and organizational challenges of scaling applications, managing microservices, and implementing effective observability practices.



The conversation highlights the pivotal role observability plays in diagnosing incidents, anticipating system behavior, and asking unplanned questions of a system. Zack and Ethan explore tracing, spans, and the unique challenges introduced by LiveView deployments and WebSocket connections.



They also discuss the benefits of OpenTelemetry as a vendor-agnostic instrumentation tool, the significance of Elixir’s telemetry library, and practical steps for developers starting their observability journey. Additionally, Zack and Ethan introduce their upcoming book, Instrumenting Elixir Applications, which will offer guidance on integrating telemetry and tracing into Elixir projects.



Topics Discussed:




  • Cars.com’s transition to Elixir and scaling solutions

  • The role of observability in large-scale systems

  • Uncovering insights by asking unplanned system questions

  • Managing high-traffic and concurrent users with Elixir

  • Diagnosing incidents and preventing recurrence using telemetry

  • Balancing data collection with storage constraints

  • Sampling strategies for large data volumes

  • Tracing and spans in observability

  • LiveView’s influence on deployments and WebSocket behavior

  • Mitigating downstream effects of socket reconnections

  • Contextual debugging for system behavior insights

  • Observability strategies for small vs. large-scale apps

  • OpenTelemetry for vendor-agnostic instrumentation

  • Leveraging OpenTelemetry contrib libraries for easy setup

  • Elixir’s telemetry library as an ecosystem cornerstone

  • Tracing as the first step in observability

  • Differentiating observability from business analytics

  • Profiling with OpenTelemetry Erlang project tools

  • The value of profiling for performance insights

  • Making observability tools accessible and impactful for developers



Links Mentioned:



https://www.carscommerce.inc/

https://www.cars.com/

https://hexdocs.pm/telemetry/readme.html

https://kubernetes.io/

https://github.com/ninenines/cowboy

https://hexdocs.pm/bandit/Bandit.html

https://hexdocs.pm/broadway/Broadway.html

https://hexdocs.pm/oban/Oban.html

https://www.dynatrace.com/

https://www.jaegertracing.io/

https://newrelic.com/

https://www.datadoghq.com/

https://www.honeycomb.io/

https://fly.io/phoenix-files/how-phoenix-liveview-form-auto-recovery-works/

https://www.elastic.co/

https://opentelemetry.io/

https://opentelemetry.io/docs/languages/erlang/

https://opentelemetry.io/docs/concepts/signals/traces/

https://opentelemetry.io/docs/specs/otel/logs/

https://github.com/runfinch/finch

https://hexdocs.pm/telemetry_metrics/Telemetry.Metrics.html

https://opentelemetry.io/blog/2024/state-profiling

https://www.instrumentingelixir.com/

https://prometheus.io/

https://www.datadoghq.com/dg/monitor/ts/statsd/

https://x.com/kayserzl

https://github.com/zkayser

https://bsky.app/profile/ethangunderson.com

https://github.com/open-telemetry/opentelemetry-collector-contrib

Special Guests: Ethan Gunderson and Zack Kayser.