Django QuerySet vs. raw SQL performance considerations
Answer a question
I'm learning Django and its ORM data access methodology and there is something that I'm curious about. In one particular endpoint, I'm making a number of database calls (to Postgres) - below is an example of one:
projects = Project.objects\
.filter(Q(first_appointment_scheduled=True) | (Q(active=True) & Q(phase=ProjectPhase.meet.value)))\
.select_related('customer__first_name', 'customer__last_name',
'lead_designer__user__first_name', 'lead_designer__user__last_name')\
.values('id')\
.annotate(project=F('name'),
buyer=Concat(F('customer__first_name'), Value(' '), F('customer__last_name')),
designer=Concat(F('lead_designer__user__first_name'), Value(' '), F('lead_designer__user__last_name')),
created=F('created_at'),
meeting=F('first_appointment_date'))\
.order_by('id')[:QUERY_SIZE]
As you can see, that's not a small query - I'm pulling in a lot of specific, related data and doing some string manipulation. I'm relatively concerned with performance so I'm doing the best I can to make things more efficient by using select_related() and values() to only get exactly what I need.
The question I have is, conceptually and in broad terms, at what point does it become faster to just write my queries using parameterized SQL instead of using the ORM (since the ORM has to first "translate" the above "mess")? At what approximate level of query complexity should I switch over to raw SQL?
Any insight would be helpful. Thanks!
Answers
The question I have is, conceptually and in broad terms, at what point does it become faster to just write my queries using parameterized SQL instead of using the ORM (since the ORM has to first "translate" the above "mess")?
If you are asking about performance, Never.
The time taken to convert the ORM query into SQL will be very small compared to the time taken to actually execute that query. Brain cells are irreplaceable, servers are cheap.
If you are really do have performance issues the first place to look at is the your indexes in your models. Try printing out each of the queries generated by the ORM and run them in your psql console by prefixing EXPLAIN ANALYSE.
You can also use the django-debug-toolbar to automate this. In fact django-debug toolbar is an essential tool to hunt down bottlenecks. You will be surprised to note how often you have missed a simple select_related and how that causes hundreds of additional queries to be executed.
At what approximate level of query complexity should I switch over to raw SQL?
if you are asking about the ease of coding, it depends.
If the query is very very hard to write using the ORM and it's unreadable, yes, then it's perfectly fine to use a raw query. For example a query that has multiple aggregations, uses common table expressions, multiple joins etc can sometimes be hard to write as an ORM query, in that case if you are comfortable with raw sql writing it that way is fine.
更多推荐
所有评论(0)