mwh's blog
python, maths, partying and climbing

pydoctor convenience:

It's been a while since I've done much with pydoctor (and even longer since I posted here) but I've just added what I think is a reasonably nifty new command to it.

If you get revno 525 or newer of https://code.edge.launchpad.net/~mwhudson/pydoctor/dev, 'pydoctor --auto' should find all packages and modules in the current directory, analyze them and pop up your browser showing the index page for the API docs of what it found. pydoctor will be running in its poorly advertised 'editing' mode, where it runs an HTTP server that allows you to edit docstrings and get a diff of all the edits you've made.

Please try it and see how you find it!

some nevow silliness: I have an application (pydoctor) that uses nevow for its html generation. One of the newer classes uses a big "loaders.stan" for its docFactory, and this was getting a bit unwieldy so I wanted to switch to using loaders.xmlfile. But that meant I had to convert the stan expression into a template, which seemed like it would be pretty boring. So I hacked instead, and wrote an implementation of nevow.inevow.IRendererFactory that, when the stan is rendered, produces as output xml suitable for use with loaders.xmlfile.

The (fairly insane in places) code is here. For an example of what it does, it turns this into this. It doesn't support slots or patterns or ..., just those things I needed so far. It says something for nevow that this is even remotely possible :-)

As I've been learning bzr recently, I've made http://python.net/crew/mwh/hacks/stan2xml/ a bzr branch. Try bzr branch http://python.net/crew/mwh/hacks/stan2xml/.

EuroPython 2007: Call for Proposals: After some late-night CPS wrangling (fortunately mostly not done by me; Paul Boddie is a hero), I'm very happy to announce the EuroPython 2007 call for proposals:

http://www.europython.org/sections/tracks_and_talks/announcements/call-for-proposals

on behalf of all the people involved in organizing EuroPython this year.

Calling the Python web community: I'm program chair for EuroPython (9th-11th July in Vilnius this year), and I need your help: a huge part of the Python community is involved in web-related technologies but I am not, so I'd like to get some help to make sure we get lots of quality submissions, and to assess the submissions we get. If you think you can help with this, please let me know! Either leave a comment here, mail me (micahel@gmail.com) or best of all, come along to our meeting in #europython on freenode tonight:

http://codespeak.net/svn/user/mwh/ep2007/meeting-agenda-2007-03-14.html
Let's keep the web track at EuroPython great!

Memory profiler
Progress on a memory profiler for the Google Summer of Code

[Comments] (1) :

I've implemented the frame-recording thing I posted about yesterday. It's now the function annotate.findcreators, and it seems to work *very* well. In fact, I think I prefer it to annotate.groupby: it doesn't have to guess, it seems to give better information and it might work well in a GUI (the information it gives is hierarchical).

And I can't help noticing that the SoC is coming to an end...so I thought I should post about what I've done.

* Everything is split (with the exception of annotate.findcreators at the moment) into implementation-specific (CPython, PyPy, IronPython etc.) and implementation-independent parts. There's a cpython directory which contains everything CPython-specific.

To get another implementation running, you should only need to write classes to scan instances of object and whatever other types you have that need special treatment, and write functions to get a set of roots and to get the name of a type.

* The scanner works. It's written in an implementation-independent way (I hope). If the implementation supports type.mro(), it can construct a wrapper of a type from the wrappers of its superclasses - i.e. given code to scan class Foo and class Bar, it can scan class Baz(Foo, Bar).

In practice this means that only scanning code for very strange objects (in the CPython implementation, C objects) needs to be defined, and the scanner will sort out the rest itself.

The scanner gets wrapper classes from the sizes module, which should be filled in by the implementation-specific part, and which can also be used to treat some objects and types specially. For example, the scanner puts an entry in here which stops the wrappers themselves from being counted as taking space.

* The code is quite useful, with a fair number of ways of analysing the scanner's output (including making graphs :-). In a short time I found some waste of memory in SCons and in about half an hour reduced its memory usage from 24MB to 22MB. (I haven't tried to reduce it any further yet - there's certainly room for improvement.)

* There are (hopefully useful) docstrings on all public functions and classes. I've written part of a tutorial, which has links to HappyDoc-generated docs made from the docstrings, and a document describing the internals of the scanner.

* The CPython-specific code can scan all built-in types which are at all possible to scan.
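
As a rough illustration of building a wrapper for class Baz(Foo, Bar) from the wrappers of Foo and Bar, here is a minimal sketch in present-day Python. It is not the profiler's code: ObjectWrapper, wrapper_for, the wrappers dict and the byte counts are all invented for the example, and it walks __bases__ rather than calling type.mro() directly.

```python
class ObjectWrapper:
    """Hypothetical base wrapper: every object pays a made-up 16-byte header."""

    def __init__(self, obj):
        self.obj = obj

    def size(self):
        return 16


class Foo:
    pass


class Bar:
    pass


class FooWrapper(ObjectWrapper):
    def size(self):
        # Add Foo's (made-up) share on top of whatever comes next in the MRO.
        return super().size() + 8


class BarWrapper(ObjectWrapper):
    def size(self):
        return super().size() + 4


# Hypothetical registry, standing in for what the implementation-specific
# part would fill in.
wrappers = {object: ObjectWrapper, Foo: FooWrapper, Bar: BarWrapper}


def wrapper_for(cls):
    """Return a wrapper class for cls, deriving one from its bases if needed."""
    if cls not in wrappers:
        bases = tuple(wrapper_for(base) for base in cls.__bases__)
        wrappers[cls] = type(cls.__name__ + "Wrapper", bases, {})
    return wrappers[cls]


class Baz(Foo, Bar):
    pass


baz_wrapper = wrapper_for(Baz)(Baz())
total = baz_wrapper.size()  # FooWrapper, then BarWrapper, then ObjectWrapper
```

Because the generated wrapper inherits from the wrappers of the bases, cooperative super() calls visit them in the same order as the MRO of the wrapped class, so nobody has to write a wrapper for Baz by hand.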

On the other hand:

* I haven't written any code to scan any types from extension modules. I will do when it becomes a problem - lots of programs don't use extensions much.

* There's a bit of a mess when making copies of wrappers. I think I can fix that by adding an extra level of indirection.

* Each wrapper contains a reference to the object it's wrapping, and quite a lot of code uses it. That's not very good for mutable types, since the object can have changed when code looks at it, and it also prevents pickling the sets of wrappers for future use.

* There's no nice GUI yet...once I've fixed the problems with wrappers, I think I'll have a go at writing one.
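
One possible way round the live-reference problem, sketched here purely as an assumption about how it could be done, is for the scanner to copy out what the analysis needs and keep no reference to the object. Snapshot, snapshot and the field names are hypothetical, and the size is passed in by hand rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical detached record of one scanned object."""

    obj_id: int       # id() of the object at scan time
    type_name: str    # name of its type
    size: int         # bytes attributed to it by the scanner
    referents: tuple  # ids of the objects it referred to at scan time


def snapshot(obj, size, referents=()):
    """Copy out what the analysis needs, keeping no reference to obj."""
    return Snapshot(id(obj), type(obj).__name__, size,
                    tuple(id(r) for r in referents))


items = ["a", "b"]
record = snapshot(items, size=72, referents=items)
items.append("c")  # a later mutation does not change the record
```

A record like this holds only ints and strings, so it does not change when the object does. The catch is that ids are only meaningful while the original objects are alive, so records from one scan can't be compared by id with those from another.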

Also, I want to port it to other implementations. PyPy looks very tempting :-)

Algorithms and whatnot:

I've been trying to think of the best way to use the frame recording code.

It could be useful to be able to see all the functions which created objects along with which objects they created. Then you could look back to see which functions called them, and which objects were created by the back function calling the other function.

Then you could just give out a head which represented Py_NewReference, or something like that, and created all the objects. The back functions of that would be the pieces of code which created objects themselves.

That seems to be a sensible way of representing it, I think. I'm not quite sure how to implement that...it looks like a suffix tree might be the best way.
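
To make the idea concrete, here is a small sketch that uses a plain trie rather than a real suffix tree. It assumes each created object comes with its call stack, innermost frame first; the root stands for the allocation point, and each level below it holds the callers of the level above. Node, build_tree and the frame names are made up for the example.

```python
class Node:
    """One function in the tree, with the number of objects created under it."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.callers = {}  # caller name -> Node


def build_tree(stacks):
    """Merge call stacks (innermost frame first) into a tree of callers."""
    root = Node("<allocator>")
    for stack in stacks:
        node = root
        node.count += 1
        for frame in stack:
            node = node.callers.setdefault(frame, Node(frame))
            node.count += 1
    return root


# Three objects, each recorded with the stack that created it.
stacks = [
    ("make_node", "parse", "main"),
    ("make_node", "load", "main"),
    ("make_list", "parse", "main"),
]
tree = build_tree(stacks)
```

Here tree.callers holds the functions that created objects directly, and following .callers from any of them answers "who called this, and how many objects did that path create?".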

More fiddling with SCons, and some more docs:

I've looked at SCons some more. A huge amount of its memory is taken up by dicts. Instances of BuildInfo seem to take up a lot of space too, but they're really a big mess and it doesn't look too easy to fix. Never mind.

I've started writing a document about how the profiler works internally. It's also an excuse to learn LaTeX :-) At the moment it goes through how the scanner works, and has a couple of ideas for improving it.

Trying on SCons:

The profiler found that there are lots of empty dictionaries in instances of SCons.Node.FS.File, so I replaced a few of them with wrappers which only make a real dictionary when needed.

Before the change (output is from scons --debug=memory -n):
Memory before SConscript files: 7516160
Memory after SConscript files: 17731584
Memory before building: 17731584
Memory after building: 23470080

After the change:
Memory before SConscript files: 7520256
Memory after SConscript files: 16392192
Memory before building: 16392192
Memory after building: 21549056

So that small change (a lazy dictionary class, which took about half an hour to write, plus 28 other changed lines, mostly replacing {} with lazydict()) has reduced memory usage after building by roughly 2MB out of 24. Which is not bad, but not very big either. I'll have another look later to see what else can be reduced.
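
For the curious, a lazy dictionary along these lines might look like the sketch below. This is a reconstruction in present-day Python, not the class that went into SCons: only the name lazydict comes from the change above, and the _data attribute and everything else are assumptions.

```python
from collections.abc import MutableMapping


class lazydict(MutableMapping):
    """Dict-like object that allocates a real dict only on the first write."""

    __slots__ = ("_data",)

    def __init__(self):
        self._data = None  # no real dict until something is stored

    def __getitem__(self, key):
        if self._data is None:
            raise KeyError(key)
        return self._data[key]

    def __setitem__(self, key, value):
        if self._data is None:
            self._data = {}
        self._data[key] = value

    def __delitem__(self, key):
        if self._data is None:
            raise KeyError(key)
        del self._data[key]

    def __iter__(self):
        return iter(self._data or ())

    def __len__(self):
        return 0 if self._data is None else len(self._data)


d = lazydict()
empty_before_write = d._data is None
d["target"] = "foo.o"
```

MutableMapping supplies get, items, update and friends on top of these five methods, so most code that only reads and writes keys should not notice the difference. Whether it saves anything depends on how many of the dicts stay empty and on how big an empty dict is in the Python being used.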

:

Well, I've got the recording-frames idea going, but no code to use it yet. It needed changes to _Py_NewReference and _Py_Dispose in the end. Also, I've made the wrappers use slots, so the profiler itself uses a *lot* less memory than before.
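
The slots change is the usual trick of declaring __slots__ so that instances don't each carry their own attribute dict. A minimal sketch, with made-up class and attribute names rather than the profiler's real wrappers:

```python
class DictWrapper:
    """Ordinary class: each instance gets its own attribute dict."""

    def __init__(self, obj, size):
        self.obj = obj
        self.size = size


class SlotWrapper:
    """Same fields, stored in fixed slots with no per-instance dict."""

    __slots__ = ("obj", "size")

    def __init__(self, obj, size):
        self.obj = obj
        self.size = size


plain = DictWrapper([], 32)
slotted = SlotWrapper([], 32)

has_dict_plain = hasattr(plain, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")
```

The price is that a slotted instance can't grow new attributes later, which is a reasonable trade when there is one wrapper per scanned object.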

I've been trying to profile SCons. Looks hopeful: when compiling Unununium, there are several megabytes of empty dicts and lists. Maybe it would help to have a lightweight dict-wrapper class that only created a real dict when it was written to.

Unless otherwise noted, all content licensed by Michael Hudson
under a Creative Commons License.