|
|
||||||||||||||||||
|
|
||||||||||||||||||
![]() |
![]() |
Issue 9 - Revision 8 / February 7, 2005
|
|||
|
Profiling Zope - (Part 2 of 2) - - - - - - - - - - - - By Michel Pelletier | September 21, 2004 Part II: Profile Analysis Part One of this two-part series explained Python and Zope profiling, and gave a short introduction to profling concepts. Now that you understand a bit about the theory of profiling and how to turn it on in Zope, I'll tell you about the numbers the profiler spits out, and what they mean. When you run a Zope profile, the profiler gives you a table of information. Here is a screenshot of the table that Zope gives you when you run a profile:
'Sort' changes the column of data by which the information is sorted. The three most useful columns to sort your information by are:
Already you can see how this information would be of use. If you want to see which method is taking the most time in your Zope site, sort the information by 'time'. If that doesn't show you much, maybe your methods are taking lots of time because they themselves are calling lots and lots of little methods, so try sorting by 'cumulative'. If this is the case, you can confirm it by sorting by 'calls'. In addition to looking at the data these three ways, the profiler can also show you who called what. Over to the right you will notice that you can select a 'Mode'. So far you have been sorting by 'stats', but there are two other modes you can choose, 'callers' and 'callees'. The 'callers' mode lists the calling methods and shows the methods they called. The 'callees' mode lists the called methods and shows what methods they were called by. Using Apache BenchmarkApache Benchmark (ab) is a simple HTTP client program that executes the same request to a server over and over a set number of times. You can also configure ab to issue requests simultaneously. ab is used to simulate "real-word" load on your server. Obviously, the more you can get your profile tests to look like real user requests the more accurate the profile data will be. The general pattern is to start up Zope with profiling turned on and then use 'ab' to issue a whole bunch of requests. You can then analyze the profile data from those requests to look for bottlenecks. How wide or narrow you wish to make the requests is up to you and is based on the problem at hand but, generally, narrow requests will show you more specific information. A word of caution: while 'ab' is directing requests at your site any other requests or activities on your Zope site will skew the results by "polluting" your test case with outside profile data. Even just logging in or going to the Control Panel to look at the profile data in the middle of the tests will throw your results off somewhat. You're just asking for confusion if you do this, so don't do it. Keep everyone logged out, turn on profiling, run the tests using 'ab', and only then go and look at the information on the Profiling screen. An ExampleHere is a tricky bit of code to profile. It runs quickly enough, so at first it might be acceptable to you, but when you are looking to squeeze more speed out of your site, it's code like this that you're going to have to look for, and look hard. Consider the following Page Template:
This is a trivial workflow-inspired example, but in big, complex templates it's easy to make the kind of mistake we describe below. Here you see that a deep folder is being iterated over, and a method in the folder is being called for each object to get the 'submitter' for the file. If you recreate this folder and populate it with lots of files (and add a simple 'getSubmitter' method that just returns a string - I also padded my 'getSubmitter' with a loop to exaggerate its expense), you'll see that it runs slow, and it gets slower the more files you add. Can you guess what is taking the most time? Do you think it's 'getSubmitter'? If you run the profiler, you might be surprised. Turn it on and try it out: here are my results for the above template with the folder containing a hundred files:
ncalls tottime percall cumtime percall filename:lineno(function)
21060 3.230 0.000 4.890 0.000 ZopeGuards.py:220(guard)
2132 2.720 0.001 3.280 0.002 Expressions.py:290(restrictedTraverse)
36410 1.740 0.000 1.740 0.000 SecurityManagement.py:19(getSecurityManager)
3 1.600 0.533 1.600 0.533 ApplicationManager.py:183(refcount)
23166 1.410 0.000 6.300 0.000 ZopeGuards.py:209(next)
2119/13 1.320 0.001 28.320 2.178 TALInterpreter.py:218(interpret)
2106 1.310 0.001 7.760 0.004 Script (Python):1(getSubmitter)
Surprised? Even with my attempt to hobble it with a useless loop, 'getSubmitter' is only seventh on the list, totalling just over one second of time (the methods used, in order of frequency, are: guard; restrictedTraverse, getSecurityManager; refcount; next; interpret; and getSubmitter). So what is taking all the time? Zope's security machinery! The first item, 'guard' has almost three times the total time of 'getSubmitter'. Other security-related methods like 'restrictedTraverse', 'getSecurityManager', and 'next' also top the list. So whereas you at first might have thought 'getSubmitter' was the right method to optimize, what you really need to do is reduce the number of security checks being carried out. Note that each time the template iterates over the folder, it calls the following line:
Here is where a lot of security checks are made. In TAL, every time you specify a path element, the security system checks to make sure you have permission to reach that object. So the above line triggers a check for 'here', 'Library', 'OfficeFiles', 'Drafts', and finally 'getSubmitter' every time it goes through the loop. Using a simple 'tal:define' you can perform these checks once:
Now the lookup of all the folders in between this script and 'getSubmitter' is done once. Re-run the tests (don't forget to restart Zope to clear the old profiler stats!) and now you can see that the template has been sped up significantly:
ncalls tottime percall cumtime percall filename:lineno(function)
16200 2.580 0.000 3.710 0.000 ZopeGuards.py:220(guard)
3 1.620 0.540 1.620 0.540 ApplicationManager.py:183(refcount)
17820 1.330 0.000 5.040 0.000 ZopeGuards.py:209(next)
24479 1.130 0.000 1.130 0.000 SecurityManagement.py:19(getSecurityManager)
1620 0.940 0.001 6.130 0.004 Script (Python):1(getSubmitter)
'getSubmitter' has moved from seventh to fifth, and all the time spent in the security machinery has been reduced both with respect to time and to the number of calls ('guard' was called almost 5000 times less!). Keep in mind that this technique can also be applied to Python scripts. The key to avoiding these problems is to minimize the use of a '/' in Page Templates or a '.' in Python scripts, both of which trigger security checks. Can this example be made even faster? You bet. The task above was really an example for doing things the naive way in Zope. Whenever you need to present objects and metadata about them (such as their 'submitter') always think of ZCatalog. Optimizing this one use case has been the purpose of ZCatalog since the very beginning. With the ZCatalog, the above example could be reduced to a few milliseconds, with a bare minimum number of security checks. Setting this up involves a "Path Index" and is left to you as an example, but I recommend you check out The Zope Book on Zope.org or Andy McKay's upcoming book The Definitive Guide To Plone which covers cases like this in great detail. Using a ZEO Client ProfilerLast, I'd like to mention ZEO. ZEO is great for large sites that need to distribute the load of the site, but it also has some unexpected benefits for the developer. While your "production" ZEO clients are serving up your site, you can spawn your own "private" ZEO clients to debug or profile the site without affecting your production ZEO clients at all. This is very useful for debugging a "live" site after it has been deployed. ZEO clients that are profiling your site will, in theory, not interfere with your other ZEO clients which are serving the site "in production". The only way a profiling ZEO client can affect others is through the ZEO server database (which is not being profiled), and profiling is so CPU-intensive on the ZEO client that it will be at a significant disadvantage and lower database resource priority on the ZEO server more than the production clients. Actually, you don't have to worry about how the profiler will affect your ZEO setup unless you are profiling code that causes lots of ZODB “ConflictErrors”, which occur when two ZODB connections attempt to write to the same object. Profiling code that causes lots of conflicts can effect the behavior of a profiled ZEO client because the client runs slower and thus cannot attempt as many conflicting writes. I mention this because normally you can consider a ZEO client to be completely separate from other ZEO clients, but when profiling buggy or conflicting behavior you have to take the whole system into consideration. ZEO clients can also be brought up in debug mode, bringing you to a Python interpreter prompt. There, you can use the 'profile' and 'pstats' Python modules directly to profile any code from the prompt. For more information on these modules which underlie the Zope profiler support, see the Python documentation on Python.org.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||
| ZopeMag is committed to bringing you the best in Zope Documentation. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
![]() |
Reproduction of material from any of ZopeMag's pages without prior written permission is strictly prohibited. Copyright 2003 - 2005 ZopeMag |
|