Query Optimization

Write faster SPARQL queries. Learn query planning principles, avoid common performance pitfalls, and use Wikidata-specific hints.

Optimization Principles

| Principle | Why It Helps |
|---|---|
| Filter early | Reduce data volume before expensive operations |
| Be specific | Narrow patterns match fewer triples |
| Limit results | Stop processing when you have enough |
| Use indexes | Property patterns use indexes; calculations don't |
| Avoid cartesian products | Unrelated patterns multiply result counts |
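
The cartesian-product principle is easy to violate by accident. As a minimal sketch using only identifiers that appear elsewhere on this page (wd:Q9143 for programming language, wdt:P287 for designed by), the first query below contains two patterns with no variable in common; the variable ?thing is purely illustrative.

```sparql
# SLOW: ?lang and ?designer share no variable, so every language
# is paired with every designer of anything (a cartesian product)
SELECT ?lang ?designer
WHERE {
  ?lang wdt:P31 wd:Q9143 .
  ?thing wdt:P287 ?designer .
}
LIMIT 20
```

```sparql
# FAST: both patterns constrain ?lang, so they join instead of multiplying
SELECT ?lang ?designer
WHERE {
  ?lang wdt:P31 wd:Q9143 ;
        wdt:P287 ?designer .
}
LIMIT 20
```

How much the first form costs depends on the engine and on the LIMIT, but its results are rarely what was intended in any case.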

Pattern Order Matters

Slow: Filter After Large Scan
# SLOW: Scans all cities, then filters
SELECT ?city ?cityLabel ?pop
WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .  # ALL cities worldwide
  ?city wdt:P1082 ?pop .
  ?city wdt:P17 wd:Q142 .  # Then filter to France

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20
Fast: Filter First
# FAST: Filter to France first, then get details
SELECT ?city ?cityLabel ?pop
WHERE {
  ?city wdt:P17 wd:Q142 ;  # France first (smaller set)
        wdt:P31/wdt:P279* wd:Q515 ;
        wdt:P1082 ?pop .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20

Avoid Unbounded Traversals

Slow: Unlimited Depth
# SLOW: Can traverse entire class hierarchy
SELECT ?class ?classLabel
WHERE {
  ?class wdt:P279* wd:Q35120 .  # entity - very broad!

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
Fast: Limited Depth
# FAST: Limit traversal depth
SELECT ?class ?classLabel
WHERE {
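  # 1-3 levels below "programming language"; written with / and ? because
  # the {1,3} quantifier is not part of the SPARQL 1.1 standard
  ?class wdt:P279/wdt:P279?/wdt:P279? wd:Q9143 .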

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}

Prefer Direct Properties Over Calculations

Slow: String Comparison
# SLOW: String comparison can't use indexes
SELECT ?item ?itemLabel
WHERE {
  ?item wdt:P31 wd:Q9143 ;
        rdfs:label ?label .
  FILTER(STR(?label) = "Python")

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
Fast: Direct Reference
# FAST: Use entity directly
SELECT ?item ?itemLabel
WHERE {
  VALUES ?item { wd:Q28865 }  # Python directly

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
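
When the entity ID is not known in advance, an exact label can still be matched as a triple pattern rather than through FILTER. The query below is a sketch that assumes the English label is exactly "Python"; the @en tag is needed because Wikidata labels are language-tagged literals, and a plain "Python" would not match them.

```sparql
# Exact literal in the triple pattern: a direct lookup, no FILTER needed
SELECT ?item ?itemLabel
WHERE {
  ?item rdfs:label "Python"@en ;
        wdt:P31 wd:Q9143 .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

This only works for exact, case-sensitive matches in one known language; partial or case-insensitive matching still needs a FILTER.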

Use CONTAINS Instead of REGEX

Slow: Full Regex
# SLOW: Regex is expensive
SELECT ?lang ?langLabel
WHERE {
  ?lang wdt:P31 wd:Q9143 ;
        rdfs:label ?label .
  FILTER(REGEX(?label, "script", "i"))

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20
Fast: CONTAINS
# FASTER: CONTAINS is simpler
SELECT ?lang ?langLabel
WHERE {
  ?lang wdt:P31 wd:Q9143 ;
        rdfs:label ?label .
  FILTER(CONTAINS(LCASE(?label), "script"))

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20
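
Both versions above test the labels in every language, so an item can appear once per matching label. A sketch of one further narrowing, assuming only English labels are of interest:

```sparql
# Restrict to English labels, then apply the substring test
SELECT ?lang ?langLabel
WHERE {
  ?lang wdt:P31 wd:Q9143 ;
        rdfs:label ?label .
  FILTER(LANG(?label) = "en")                 # keep English labels only
  FILTER(CONTAINS(LCASE(?label), "script"))   # case-insensitive substring match

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20
```

The engine is free to evaluate the two filters in either order, so the main gain is fewer candidate strings overall and no duplicate rows from other languages.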

Wikidata-Specific: Optimizer Hints

hint:Prior for Join Order
SELECT ?city ?cityLabel ?pop
WHERE {
  # Force this pattern to be evaluated first
  ?city wdt:P17 wd:Q142 .
  hint:Prior hint:runFirst "true" .

  ?city wdt:P31/wdt:P279* wd:Q515 ;
        wdt:P1082 ?pop .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20
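
A related Blazegraph hint turns off join reordering for the whole query, so patterns run in the order they are written. The sketch below applies it to the same city query; it is worth trying only when the automatic order is measurably worse, because it also disables reorderings that would have helped.

```sparql
SELECT ?city ?cityLabel ?pop
WHERE {
  # Disable the optimizer: evaluate the patterns below top to bottom
  hint:Query hint:optimizer "None" .

  ?city wdt:P17 wd:Q142 ;              # France first (smaller set)
        wdt:P31/wdt:P279* wd:Q515 ;
        wdt:P1082 ?pop .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 20
```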

Subqueries for Complex Logic

Subquery to Limit First
SELECT ?lang ?langLabel ?designerLabel
WHERE {
  # First: get top 20 languages by sitelinks
  {
    SELECT ?lang
    WHERE {
      ?lang wdt:P31 wd:Q9143 ;
            wikibase:sitelinks ?sitelinks .
    }
    ORDER BY DESC(?sitelinks)
    LIMIT 20
  }

  # Then: get expensive details only for top 20
  OPTIONAL { ?lang wdt:P287 ?designer . }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
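
The same two-step idea can also be written as a named subquery. This is a Blazegraph extension (WITH ... AS %name together with INCLUDE), not standard SPARQL 1.1, so treat the sketch below as specific to endpoints that support it; the name %topLanguages is arbitrary.

```sparql
SELECT ?lang ?langLabel ?designerLabel
WITH {
  # Evaluated once: the 20 languages with the most sitelinks
  SELECT ?lang
  WHERE {
    ?lang wdt:P31 wd:Q9143 ;
          wikibase:sitelinks ?sitelinks .
  }
  ORDER BY DESC(?sitelinks)
  LIMIT 20
} AS %topLanguages
WHERE {
  INCLUDE %topLanguages .

  # Details are fetched only for the included rows
  OPTIONAL { ?lang wdt:P287 ?designer . }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

A named subquery is most useful when the same intermediate result is needed in more than one place, since it can be included several times without being recomputed.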

Optimization Checklist

Performance Comparison

| Slow Pattern | Fast Alternative |
|---|---|
| FILTER(STR(?x) = "...") | VALUES ?x { wd:Q... } |
| REGEX(?x, "...") | CONTAINS(?x, "...") |
| wdt:P279* (unbounded) | wdt:P279/wdt:P279?/wdt:P279? (bounded depth) |
| Filter at end | Selective pattern first |
| SELECT * | SELECT ?specific ?variables |
| No LIMIT | LIMIT during development |