A Guide to Looker Performance Optimization

Purpose: While many resources cover Looker performance, this guide synthesizes best practices into four key pillars:

  • Understanding Looker’s philosophy
  • Optimizing and simplifying execution
  • Enforcing guardrails and limits
  • Establishing insightful monitoring

The goal is to provide a starting point for improving performance while decreasing database consumption, which directly translates to lowered operating costs.


Step 1: Start with Your Needs

Looker’s philosophy, centered on Explores, moves away from a general, one-size-fits-all BI model by design. Each Explore is a curated, use-case specific starting point for data analysis. Looker’s in-database architecture requires additional considerations:

  • Establish your Explores on top of a suitable reporting layer:

    • It may be tempting to establish your Explore directly on top of raw data as soon as it has been ingested into your SQL-speaking database. But because of Looker’s in-database architecture, it is strongly recommended to reduce the complexity of real-time execution already at this layer.

      • Only include data / rows that are relevant to answering your questions.

      • Establish a meaningful level of pre-aggregation while still allowing users to drill as deeply into the data as necessary to answer their business questions.

      • Materialize and orchestrate general, repetitive transformations in-database.

      • Index, cluster & partition your tables in an optimal way to support your reporting needs.

      • Align capacity & resource allocation models with actual demand (with regard to data volumes, usage patterns, etc.).

    • Suitable pre-aggregation and optimization at this layer, resulting in simpler / lighter queries, is easily one of the most effective optimization strategies.

  • Design Explores small and focused:

    • Rather than covering every potential option, Explores should be designed to address a specific set of business questions. This doesn’t necessarily mean you have to repeat yourself: leverage refinements and extends for views or Explores to keep things nice & tidy (see the sketch below).

    • Only include tables that are required for reporting.
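
    A minimal LookML sketch of the refinement pattern mentioned above; the view, table and field names (orders, analytics.orders, revenue) are placeholder assumptions, not part of the original guide:

      # Broad base view, defined once.
      view: orders {
        sql_table_name: analytics.orders ;;

        dimension: order_id {
          primary_key: yes
          type: number
          sql: ${TABLE}.order_id ;;
        }
      }

      # Refinement in a separate file, included only by the model that needs it.
      # It adds a measure for one use case without duplicating the base view.
      view: +orders {
        measure: total_revenue {
          type: sum
          sql: ${TABLE}.revenue ;;
        }
      }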

  • Select your base view carefully:

    • Start each Explore with the most logical “fact” or primary view for that use case. This choice dictates the primary entry point of each query.

    • It also determines how subsequent joins need to be defined. Whenever possible, aim to build Explores on many_to_one relationships (a minimal join sketch follows this list).

  • Optimize join logic:

    • Reduce complexity of close-to-real-time query execution by critically reviewing the number & type of joins required.

    • Whenever possible, avoid full_outer joins. Use left_outer or inner joins instead.

    • Avoid joining views on concatenated primary keys. Join on the base fields instead.

    • Consider further denormalization to avoid frequently processed joins.
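
    A minimal, hypothetical sketch of the base-view and join guidance above (the explore, view and key names such as order_items, orders and users are assumptions for illustration):

      explore: order_items {
        label: "Order Analysis"

        # many_to_one joins from the most granular "fact" view keep queries fan-out safe.
        join: orders {
          type: left_outer
          relationship: many_to_one
          sql_on: ${order_items.order_id} = ${orders.order_id} ;;
        }

        # Join on the base key fields directly rather than on a concatenated key.
        join: users {
          type: left_outer
          relationship: many_to_one
          sql_on: ${orders.user_id} = ${users.id} ;;
        }
      }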

Step 2: Optimize Execution

Once your foundation is outlined & optimized, use some of Looker’s built-in capabilities to run queries more efficiently:

  • Define Caching strategies in sync with your ELT / based on data freshness needs:

    • Looker reduces the load on your database and improves performance by using cached results of prior SQL queries. Depending on usage patterns, a healthy cache hit rate may be somewhere in the ballpark of 50-70% or more.

    • There is no one-size-fits-all caching strategy across Explores. Instead, for each Explore individually, aim for the longest cache duration that your data freshness needs allow, always in line with the set of business questions to be answered (a datagroup sketch follows this group).

    • Don’t set a cache refresh rate higher than the refresh rate of the underlying reporting layer.

    • When connecting Looker to your database, consider using shared authentication instead of user-specific credentials (unless there is a good reason not to). This will allow sharing cache across users.
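
    A sketch of a possible caching setup, assuming a nightly ELT run; the datagroup name, trigger query and log table (etl_log.job_runs) are placeholders for your own load-tracking logic:

      datagroup: nightly_etl {
        # Cache is invalidated when the trigger value changes, i.e. after each load.
        sql_trigger: SELECT MAX(completed_at) FROM etl_log.job_runs ;;
        max_cache_age: "24 hours"
      }

      explore: order_items {
        persist_with: nightly_etl
        # joins omitted for brevity
      }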

  • Leverage Looker’s Aggregate Awareness functionality:

    • Looker allows creating roll-ups or summary tables on a fully customizable schedule. Once materialized, Looker will automatically identify the smallest, most efficient roll-up available and redirect your query while still maintaining accuracy.

    • A simple starting point with substantial performance benefits may be roll-ups on date dimensions (day, week, month, year), or roll-ups that exclude fragmented columns holding very granular but rarely used values (see the sketch below).
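
    A hypothetical aggregate awareness sketch for a daily roll-up; the explore, field and datagroup names are assumptions carried over from the sketches above:

      explore: order_items {
        aggregate_table: daily_revenue {
          query: {
            dimensions: [orders.created_date]
            measures: [order_items.total_revenue]
          }
          materialization: {
            datagroup_trigger: nightly_etl
          }
        }
      }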

  • Reduce reliance on ephemeral Derived Tables when possible:

    • In general, try to avoid extensive usage of derived tables. Derived table builds can put a huge load on your database and may even fail due to resource limitations or timeouts, which ultimately interferes with the user experience.

    • Even if LookML is established on top of a pre-aggregated reporting layer, there may be occasions where you need to rely on derived tables (e.g. to reflect CTEs or subqueries which are dynamically dependent on user interaction). In general, use persist_for or datagroup_trigger to materialize these tables for later reuse rather than repeated execution, and add partition_keys and indexes if possible (see the sketch below).

    • Similar to caching policies, be mindful about your persistence strategy. It is best practice to refresh in line with business needs, system load & update cycles of the underlying data foundation.
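
    Where a derived table is unavoidable, a persisted version might look roughly like this (the view name, columns and SQL are hypothetical; partition_keys is supported on dialects such as BigQuery, indexes on others):

      view: customer_facts {
        derived_table: {
          sql:
            SELECT
              user_id,
              MIN(created_at) AS first_order_at,
              COUNT(*)        AS lifetime_orders
            FROM analytics.orders
            GROUP BY 1 ;;
          # Rebuilt with the ELT cycle instead of on every query.
          datagroup_trigger: nightly_etl
          partition_keys: ["first_order_at"]
        }

        dimension: user_id {
          primary_key: yes
          type: number
          sql: ${TABLE}.user_id ;;
        }
      }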

Step 3: Simplify Queries

When implementing additional fields in LookML (dimensions & measures), always carefully consider the future execution cost.

  • Move costly operations upstream (if possible):

    • Move repetitive, costly operations, such as analytical or window calculations, string & time manipulations, or deeply nested logic, out of LookML definitions and upstream into your data warehouse, if possible (see the sketch below).
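
    As an illustration of moving logic upstream (the field and column names are hypothetical, and the commented-out variant uses BigQuery-flavored SQL):

      # Costly: string parsing re-evaluated inside every generated query.
      # dimension: email_domain {
      #   type: string
      #   sql: LOWER(REGEXP_EXTRACT(${TABLE}.email, r'@(.+)$')) ;;
      # }

      # Cheaper: the same value computed once by the ELT and simply selected.
      dimension: email_domain {
        type: string
        sql: ${TABLE}.email_domain ;;
      }
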
  • Simplify access management / RLS logic:

    • When implementing access management / RLS via user attributes plus access_filter, be mindful of the logic that gets injected into the WHERE clause of every query. Keep user groups and filter conditions simple so the injected SQL remains fast for the database to evaluate (see the sketch below).
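
    A minimal access_filter sketch; the explore, field and user attribute names are assumptions, and the filtered column should ideally be partitioned / clustered so the injected WHERE clause stays cheap:

      explore: order_items {
        access_filter: {
          field: orders.region
          user_attribute: region
        }
      }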

Step 4: Define Guardrails & Limits

Once you have established meaningful Explores, optimized execution & simplified queries, a recommended next step is to define guardrails & limits on how users can interact with this foundation.

  • Restrict access (object level):

    • Limit user access to only those Explores that are necessary to answer the relevant business questions (e.g. using user attributes and access_grant / required_access_grants). This prevents users from accidentally or unknowingly selecting the wrong, potentially much less aggregated, Explore and running costly queries (see the sketch below).
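
    A sketch of object-level gating via access grants; the grant, attribute and explore names are placeholders:

      access_grant: finance_access {
        user_attribute: department
        allowed_values: ["finance", "leadership"]
      }

      explore: revenue_detail {
        required_access_grants: [finance_access]
      }
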
  • Restrict access (row level):

    • Restrict access to historic or full data through soft (always_filter) or hard (sql_always_where) enforcement of a RLS / WHERE clause. Be aware that this only reduces consumption when applied to a partitioned column. Partition filter requirements can also be enforced directly on your database; however, this may lead to unexpected behavior if the required filter is not included in your Looker analysis (see the sketch below).
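
    A sketch of soft vs. hard row-level enforcement; the explore and date field are hypothetical, and the hard filter uses BigQuery-flavored SQL on what is assumed to be the partition column:

      explore: events {
        # Soft: applied by default, but users can change or widen it.
        always_filter: {
          filters: [events.event_date: "90 days"]
        }
        # Hard: appended to every query, users cannot remove it.
        sql_always_where: ${events.event_date} >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 YEAR) ;;
      }
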
  • Simplify content:

    • Usability aside, avoid dashboards with more than 20-25 widgets, as the resulting concurrent queries can significantly impact user experience. Instead, use e.g. drill_fields or cross-links to break content down into manageable pieces (a small drill sketch follows below).

    • If creating central, frequently and concurrently accessed assets, consider using the above techniques to establish a separate, optimized-for-purpose Explore just to serve this specific dashboard / look.
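
    A small sketch of drilling instead of piling everything onto one dashboard; the measure and drill fields are placeholders:

      measure: order_count {
        type: count
        # Clicking the measure opens a focused, filtered detail view on demand.
        drill_fields: [orders.order_id, orders.created_date, users.name]
      }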

  • Establish limits on both Looker and your database (e.g. row limits and query timeouts in Looker, and quotas or resource limits on the database side).

Step 5: Establish Monitoring

Trust is good, control is better. Even with an optimized setup surrounded by meaningful guardrails and limits, make sure you always stay on top of things.

  • Use Looker System Activity to monitor performance & consumption:

    • Looker’s internal LookML model, System Activity, provides rich built-in functionality to monitor content usage, connection details, query & asset performance, PDT logs & more. Review these logs regularly to further optimize consumption, and set up alerts to identify issues early on and take the necessary actions.

  • Monitor traffic / billing on your database:

    • Most, if not all, databases provide granular logging that lets you consistently monitor consumption and cost. Review these logs regularly to further optimize usage, and set up alerts to identify issues early on and take the necessary actions.

Step 6: Training & Awareness

While following steps one through five is essential for a proper and scalable Looker implementation, it is strongly recommended to additionally train and sensitize business users. Increasing their awareness of how Looker operates will directly improve how they interact with the system. Ultimately, this may lead to users acting in closer alignment with anticipated usage patterns, maximizing the effectiveness of the implemented optimization techniques.

  • Documentation & Guidance:

    • Provide clear, accessible resources that explain what each Explore is designed for. Establish transparent guidelines that help users select the most appropriate asset (Dashboard, Explore, Agent).

  • Sensitize on Implications:

    • Educate users on the direct impact of their query choices (e.g. how running a massive, unfiltered query translates into slower performance and higher database cost). Looker’s Explore query tracker and performance panel are a great starting point, even for business users, to gain insight into query performance.

Conclusion:

Reviewing your Looker deployment along these six steps is the starting point for creating a highly efficient, cost-effective data engine. By prioritizing a lean reporting layer and leveraging built-in features like aggregate awareness and caching, you can significantly reduce the computational load on your database. These structural optimizations, when paired with robust guardrails and proactive system monitoring, ensure that performance remains consistent as your user base scales. Ultimately, success lies in the synergy between technical precision and user education, fostering a culture of data mindfulness. Following this optimization guide will not only improve the end-user experience through faster insights but also deliver immediate value by lowering operational costs.


