==== Channel #pypy: 05/11/05 ====

[00:03] arigo (~arigo@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) left irc: "[x]chat"

----- silence for 1 hr and 47 minutes -----

[01:50] ludal (~ludal@lab75-1-81-57-254-81.fbx.proxad.net) left irc: Remote closed the connection

----- silence for 2 hr and 21 minutes -----

[04:11] fredrik (fredrik@c83-248-135-181.bredband.comhem.se) left irc: "http://fredrikj.net"

----- silence for 4 hr and 49 minutes -----

[09:00] dialtone (~dialtone@host111-56.pool80117.interbusiness.it) left irc: "This computer has gone to sleep"

[09:02] aleale (~redorlik@cpe.atm0-0-0-129140.0x3ef2fa3a.bynxx3.customer.tele.dk) joined #pypy.

----- silence for 54 minutes -----

[09:56] thingie24 (~rmt38@valhalla.ccp.cc) left irc: Read error: 104 (Connection reset by peer)

[10:01] thingie24 (~rmt38@valhalla.ccp.cc) joined #pypy.

----- silence for 28 minutes -----

[10:29] stakkars (xwcznl@dsl-62-220-9-38.berlikomm.net) joined #pypy.

[10:39] arre (ac@ratthing-b3fa.strakt.com) joined #pypy.

----- silence for 1 hr and 42 minutes -----

[12:21] arigo (~arigo@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) joined #pypy.

[12:21] <arigo> hi

----- silence for 46 minutes -----

[13:07] <aleale> hi

----- silence for 1 hr and 51 minutes -----

[14:58] <stakkars> hi

[14:59] <arigo> stakkars: we're starting to work on genc

[15:00] <arigo> but we're wondering what changes you have in your working copy

[15:00] <arigo> actually, it would be nice if you'd work on a branch, so that we can follow you :-)

[15:01] <stakkars> you mean that scoped stuff I was blathering about?

[15:01] <stakkars> this is still in brainstorming mode.

[15:01] <arigo> no, what you did before already

[15:02] <arigo> e.g. we're wondering how you get 20x speed-ups on rpystone

[15:02] <arigo> and we're also wondering if we can refactor genc or if it will break all your work

[15:02] <stakkars> well, a couple of unary operators are to be checked in, no conflicts expected.

[15:02] <stakkars> ah the 20x speed was just by a modification of rpystone, where I commented out

[15:03] <stakkars> all operations which induce heavy usage of objects. Sorry if I invoked more expectations.

[15:03] <stakkars> I just wanted to see the gain of the pure integer stuff.

[15:03] <arigo> no problem. what you invoked was the fear that you had large diffs sitting in your working copy :-)

[15:04] <stakkars> you can refactor, it will not break much.

[15:04] <stakkars> for now, I'd just like to implement what you give me to implement.

[15:06] <stakkars> I'm also working on the pickling idea.

[15:07] <stakkars> As a side effect, I'd like to introduce compact flowgraphs, which have a very tiny footprint.

[15:07] <stakkars> They are read-only and can be unpacked, again.

[15:07] <arigo> I see

[15:08] <arigo> well on the lltype front, there'll soon be need for the implementation in this strange style of the RPython operators and methods

[15:08] <stakkars> I did a bit of statistics. A problem with our current structure is that it creates myriads of structures, several

[15:09] <stakkars> megabytes in memory. Before really swapping them out, it might make sense to use a compact structure.

[15:09] <stakkars> The idea is to build operations which don't contain their operands, but just use index tuples.

[15:10] <stakkars> The compact block has a table indexed by these indices with the variables.

[15:10] <arigo> but you're not saving any memory this way ..?

[15:10] <stakkars> that makes the operations structurally similar, and they can be folded by a dict.

[15:11] <stakkars> very much memory, of course.

[15:11] <stakkars> because operations are similar by shape and indices, and immutable, they can be shrink-washed by a dict.

[15:12] <stakkars> The same holds for the blocks themselves. Their operations can be turned into tuples of

[15:12] <stakkars> compact operations and be folded away, because we have very many of similar patterns.

[15:12] <arigo> sorry, I don't follow you

[15:13] <arigo> you're proposing a compact form that doesn't use Variable instances at all?

[15:13] <stakkars> it uses variable instances. But it is not necessary to store direct references to these variables.

[15:14] <stakkars> Instead, we can use small indices. But these index patterns are showing up thousands of times in the same way.

[15:14] <arigo> one ref == 4 bytes == one int

[15:15] <arigo> ah, no I see now

[15:15] <stakkars> if you have an operation like "v1 = add(v2, v3)"

[15:15] <stakkars> then you create this object over and over.

[15:15] <stakkars> But it is likely that this melts down to like

[15:15] <stakkars> vars[3] = add(vars[5],vars[7])

[15:16] <stakkars> so I store the tuple ("add", 3, 5, 7) only once for all and re-use it.

[15:16] <arigo> ok.

[15:16] <arigo> the first thing to try however,

[15:16] <arigo> is to drop __slots__ here and there in the source and see how it helps

[15:16] <arigo> typically on Variable, Constant and SpaceOperation

[15:16] <stakkars> this is a little bit of extracting structure.

[15:17] <stakkars> Ok, that's of course what I want to do first, because it doesn't change any code.
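
As a minimal illustration of the `__slots__` suggestion (the two classes below are made up for the comparison and are not PyPy's real Variable): a class with `__slots__` stores its attributes in fixed slots and has no per-instance `__dict__`, which is where the saving would come from when there are very many instances.

```python
import sys


class PlainVariable:
    """Hypothetical class without __slots__: every instance owns a dict."""

    def __init__(self, name):
        self.name = name


class SlottedVariable:
    """Same data, but stored in a fixed slot instead of a dict."""
    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name


plain = PlainVariable("v1")
slotted = SlottedVariable("v1")

has_dict_plain = hasattr(plain, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")

# Rough per-instance cost: the object itself plus its attribute dict, if any.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
```

The exact byte counts depend on the interpreter version, so the sketch only compares the two sizes rather than quoting numbers.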

[15:17] <stakkars> another thing that I want to do is to use local name spaces per flowgraph.

[15:18] <stakkars> We have something around 250000 distinct strings for the names. This can be folded very much, too.

[15:18] <arigo> yes

[15:18] <stakkars> It just affects the way how we generate new variables.

[15:19] <arigo> feel free to rename variables in the flow graphs

[15:19] <stakkars> the reason why I'm after this is that even with my fairly good machine I got into disk swapping.

[15:19] <stakkars> ok, fine.

[15:19] dialtone (~dialtone@host111-56.pool80117.interbusiness.it) joined #pypy.

[15:19] <stakkars> the names will be interned into a dict in the Variable class, uniqueness is done by a local namesdict.

[15:19] <arigo> I was a bit confused by the fact that pygame shows names like v98738 and then the C code has names like v5. If it's easy to keep them in sync, it's all positive.

[15:20] <stakkars> this was a thing that you criticized a longer while ago, when I put too much effort into

[15:20] <stakkars> nice naming of variables, but not for the flowgraphs themselves.

[15:21] <stakkars> I will probably stick with the extra renaming, but try to generate the same by default, maybe drop my renaming then.

[15:21] <arigo> yes, now I can see that both solutions have advantages.

[15:22] <arigo> but I don't remember a situation where it was nice to have a globally unique variable name; we always know in which graph to look first

[15:22] <arigo> and if you really need a globally unique name, the Variable identity is that.

[15:22] <stakkars> I was not sure in the first place if early renaming was a good idea. I thought that we might go at

[15:22] <stakkars> some place and merge graphs together. But meanwhile I know that it is easy to rename later.

[15:23] <stakkars> I will keep the current scheme as a fall-back. The Variable names dict is used when no namespace is passed in.
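
One possible reading of the naming scheme discussed above, as a sketch only: the `namespace` parameter, `_interned` and `_global_names` are assumed names for illustration, not the actual flow-model API. Names are unique per namespace, a class-wide dict keeps a single string object per distinct name, and a global namespace serves as the fall-back.

```python
class Variable:
    """Hypothetical sketch of per-graph variable naming."""
    __slots__ = ("name",)

    _interned = {}      # class-wide table: one string object per distinct name
    _global_names = {}  # fall-back namespace when none is passed in

    def __init__(self, prefix="v", namespace=None):
        if namespace is None:
            namespace = Variable._global_names
        # The namespace maps a prefix to the next free index, which keeps
        # names unique within one graph.
        index = namespace.get(prefix, 0)
        namespace[prefix] = index + 1
        name = "%s%d" % (prefix, index)
        # Different graphs reuse the same string object for the same name.
        self.name = Variable._interned.setdefault(name, name)


graph1_names = {}
graph2_names = {}
g1 = [Variable(namespace=graph1_names) for _ in range(3)]
g2 = [Variable(namespace=graph2_names) for _ in range(3)]
```

With this layout two graphs of three variables each need only three name strings in total, and the Variable objects themselves still provide a globally unique identity.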

[15:23] <arigo> yes, and if you inline you probably need to rename anyway, because you can inline the same thing several times.

[15:23] <arigo> ok

[15:23] <stakkars> exactly. Took a little time to understand these mechanics, tho

[15:25] <stakkars> "both solutions": what are you referring to?

[15:26] <stakkars> about renaming strategies, or compactness?

[15:34] <arigo> renaming

----- silence for 41 minutes -----

[16:15] arigo (~arigo@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) left irc: Read error: 110 (Connection timed out)

[16:18] pedronis (~Samuele_P@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) joined #pypy.

[16:32] <stakkars> pedronis: Hi

[16:32] <pedronis> hi

[16:32] <stakkars> I see names like "Annonate" around a recent checkin. Do I rename it to Annotate, or is it intended?

[16:34] <stakkars> (just say "+") and I'll commit :-)

[16:34] <pedronis> +

[16:40] aleale (~redorlik@cpe.atm0-0-0-129140.0x3ef2fa3a.bynxx3.customer.tele.dk) left irc: "Eject! Eject! Eject!"

[16:44] <stakkars> bug in ctyper line 144

[16:46] <stakkars> done, test_typed works again

----- silence for 57 minutes -----

[17:43] stakkars_ (skgjenn@i3ED6B417.versanet.de) joined #pypy.

[17:43] stakkars (xwcznl@dsl-62-220-9-38.berlikomm.net) left irc: Read error: 104 (Connection reset by peer)

[17:50] Nick change: stakkars_ -> stakkars

[17:57] <stakkars> was myself :-)

[17:59] <pedronis> stakkars: ctypes it seems did not have svn:eol-style set, now it has a mixture of windows and non-windows line endings

[18:02] <stakkars> will check that.

[18:02] <stakkars> ctyper?

[18:07] <pedronis> ctyper.py, yes

[18:22] <stakkars> did it disappear for you?

[18:29] <pedronis> no

[18:32] <stakkars> ok, now. I had to trick my editor to actually write the file :-)

[18:33] <pedronis> thanks

----- silence for 22 minutes -----

[18:55] <stakkars> pedronis: you know that genc produces wrong c code at the moment, I guess?

[18:57] <pedronis> have you something particular in mind? we know that it has loose ends etc

[18:57] <stakkars> I generated targetpypymain

[18:57] <pedronis> well, we very far from generating working code for that

[18:58] <pedronis> we are

[18:58] <stakkars> yeah, it was just better before, without errors. Will look later,

[18:58] <stakkars> at the moment I'm compacting stuff with very much success.

[18:59] <stakkars> I meant syntax errors

[19:06] <stakkars> hey, I saved 125 MB by just introducing a few __slots__

[19:09] <pedronis> yes, we thought about that

[19:10] <pedronis> but we need to be careful and run the tests

[19:10] <stakkars> this really turns my machine into 99% computation speed, instead of a swapper nightmare

[19:10] <pedronis> because some code attaches attributes at a later time to model objects

[19:10] <stakkars> well, yes. I thought targetpypymain was a very good test

[19:11] <stakkars> ok, yes. And at some place, you are needing link.__dict__, is this necessary? I'd like to slottify links, too
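
The caveat raised here can be shown with a small sketch (the `Link` class below is a stand-in written for this example, not the real flow-model class): once a class declares `__slots__`, code that attaches extra attributes later, or that reaches into `__dict__`, stops working.

```python
class Link:
    """Hypothetical slotted stand-in for a flow-graph link."""
    __slots__ = ("target", "args")

    def __init__(self, target, args):
        self.target = target
        self.args = args


link = Link("block2", ["v1"])

# Attaching an attribute that is not listed in __slots__ fails.
try:
    link.debug_info = "added later"
    late_attribute_ok = True
except AttributeError:
    late_attribute_ok = False

# There is no instance __dict__ any more, so code relying on it breaks too.
link_has_dict = hasattr(link, "__dict__")
```

This is why running the test suite after each slottifying step matters: such failures only show up at the place where the late attribute is set.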

[19:14] <stakkars> test_lltyped has a graph.show() left from debugging :-)

[19:14] <pedronis> I can fix that

[19:15] <stakkars> ok, 1234 passed, 4 failed, like before

[19:15] <pedronis> I'm about to check-in some tests and support for low-level types annotation

----- silence for 24 minutes -----

[19:39] <stakkars> hum. It is not that trivial to capture the Variables with namespaces.

[19:40] <stakkars> They are created in contexts where we don't have a flowgraph sitting around I guess

----- silence for 26 minutes -----

[20:06] arre (ac@ratthing-b3fa.strakt.com) left irc: "using sirc version 2.211+KSIRC/1.3.11"

[20:06] <pedronis> stakkars: I have solved, I think, the problem with links, so that they can use slots too

[20:07] <stakkars> oh great! They might be worth it, because there are more links than blocks.

[20:11] <pedronis> stakkars: checked in

[20:20] <stakkars> not that much, but nevertheless another 16 MB

[20:21] <stakkars> so maybe I should do the blocks, too, if it doesn't hurt.

[20:25] lac (~lac@lac.silver.supporter.pdpc) left irc: "using sirc version 2.211+KSIRC/1.3.11"

[20:34] Nick change: pedronis -> pedronis_afk

[20:37] <stakkars> well, very good. The savings factor is 1.73 just by the slots :-))

[20:39] <stakkars> (or better to say, memory usage is down to 0.58 in annotation.)

[20:39] <stakkars> in genc, it is a bit worse, probably because other stuff is involved.

----- silence for 16 minutes -----

[20:55] <pedronis_afk> stakkars: slottifying the stuff in annotation.model may make sense too, although things are attached to them a bit here and there so we should be extra careful

[20:56] <stakkars> ah I see

[20:56] <pedronis_afk> with annotation there is one of those per var

[20:56] <pedronis_afk> and given debugging even more

[20:57] <stakkars> if it doesn't hurt, just fine.

[20:58] <pedronis_afk> no my point was more that finding out what slots are needed at which level of the hierarchy will be a bit more involved

[21:04] <stakkars> well, I'll just stick slots at them until it breaks no longer
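
Why the level of the hierarchy matters can be seen in a short sketch. The class and attribute names are invented for illustration and are not the real annotation.model classes. A subclass has to declare its own `__slots__` with only the attributes it adds; a subclass that declares none silently gets a `__dict__` back.

```python
class Base:
    """Hypothetical base class of a slotted hierarchy."""
    __slots__ = ("knowntype",)


class WithSlots(Base):
    # Lists only the attribute added at this level.
    __slots__ = ("const",)


class WithoutSlots(Base):
    # No __slots__ here: instances get a __dict__ again and the saving is lost.
    pass


a = WithSlots()
a.knowntype = int
a.const = 42

b = WithoutSlots()
b.knowntype = int
b.anything = "allowed, stored in b.__dict__"
```

So every class in the hierarchy needs checking, and each attribute has to be assigned to the level that actually introduces it.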

----- silence for 17 minutes -----

[21:21] mayall (~mayall@212-70-201-150.ath.dialup.tee.gr) joined #pypy.

[21:21] pedronis_afk (~Samuele_P@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) left irc: Read error: 110 (Connection timed out)

[21:21] <mayall> Hi

[21:22] <mayall> Anybody here?

[21:33] arigo (~arigo@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) joined #pypy.

[21:37] <mayall> arigo?

[21:37] <arigo> yes

[21:38] <mayall> Hi. I'm the author of pyvm

[21:38] <mayall> I'd like to ask..

[21:38] <mayall> Does pypy have a compiler that makes 2.4 bytecodes?

[21:39] <arigo> good question

[21:39] <mayall> or at least 2.3, 2.2??

[21:39] <arigo> ah

[21:39] <mayall> :)

[21:39] aleale (~redorlik@cpe.atm0-0-0-129140.0x3ef2fa3a.bynxx3.customer.tele.dk) joined #pypy.

[21:40] <arigo> we're essentially using CPython's own, but we have a choice of two Python implementations of the 'parser' module

[21:40] <arigo> so yes, indirectly:

[21:40] <arigo> it's possible to use the parser Python implementation and then the pure Python stdlib 'compiler' package to go from source to bytecode.

[21:41] <mayall> Hmm....

[21:41] <arigo> the parsers, as I understand, are both rather flexible. Upgrading them to 2.4 syntax is probably trivial.

[21:42] <mayall> Indeed.

[21:42] <mayall> Does pypy depend on a specific version of bytecode?

[21:43] <arigo> not specifically. it supports the 2.4 opcodes if it encounters them.

[21:44] <arigo> (to be more precise about my previous answer: the current parser, developed by ludal, works by loading a Python 'Grammar' file -- so you can parse whatever Python version you like by providing the corresponding 'Grammar' from the CPython distribution)

[21:46] <mayall> The grammar file seems to be more about the lexical analysis. Getting to bytecode (and probably doing AST peephole optimizations) seems rather more complicated.

[21:46] <arigo> there is the standard library 'compiler' package for that

[21:48] <mayall> If python's compiler is about 3000 lines of C, then an implementation in python (pypy:) would be, .. what .. 400 lines?

[21:50] <arigo> compile.c is 6700 lines; the whole Lib/compiler/ subdirectory is 5800 lines... not sure what that tells, though.

[21:51] <arigo> probably that Lib/compiler/ is written in a very Java-ish style, even with some files automatically generated.

[21:52] <arigo> I'm sure a much shorter implementation would be possible, but our focus is more on semantics than syntax so far...

[21:54] <mayall> I see... So using bytecode is not so essential for pypy. You just care mainly about Python Source Code. Yes?

[21:55] <arigo> yes and no :-)

[21:56] <mayall> :) Let me remind you that bytecode is the best choice for dynamic code execution (exec, eval, etc)

[21:56] <arigo> we use bytecodes only, and compile from source to bytecode by "cheating" i.e. asking CPython to do it for us (though that's being worked on now).

[21:56] <arigo> but the specific bytecode format is not essential, as long as the interpreter and the bytecode compiler agree, of course.
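
For readers unfamiliar with the "ask CPython to compile" step, here is a minimal sketch using only built-ins of a current CPython. It does not use the Python 2 'compiler' package mentioned above, which no longer exists in Python 3, and it says nothing about how PyPy wires this up internally.

```python
import dis

source = "result = 6 * 7\n"

# The host interpreter turns source text into a code object (bytecode).
code = compile(source, "<example>", "exec")

# Any interpreter that agrees on the format can then run that code object;
# here the host interpreter executes it itself.
namespace = {}
exec(code, namespace)

# The opcode names depend on the Python version, which is the point made
# above: compiler and interpreter only have to agree with each other.
opnames = [ins.opname for ins in dis.get_instructions(code)]
```

The list in `opnames` differs between CPython releases, so the sketch deliberately makes no claim about its contents.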

[21:57] <arigo> we focus on "semantics" in the sense that PyPy provides a correct object library with the correct types, built-in modules, etc.

[21:58] <arigo> the interpreter is very much interchangeable; you can plug one that would recognize a wholly different bytecode format, for experimentations.

[22:00] <mayall> Like Psyco and Stackless?

[22:00] <arigo> ah, that's still different :-) Psyco and Stackless aren't introducing a new bytecode format

[22:01] <arigo> they work by a change of semantics on what the bytecode operations do, essentially

[22:02] <arigo> the common point between Psyco and Stackless and PyPy is precisely that they are rather unrelated to issues of bytecode format fine-tuning.

[22:02] <mayall> Sorry for the delay, I'm still trying to parse the concept:)

[22:02] <arigo> :-)

[22:04] <mayall> Ok. Question 2: I would like very much to find some big test cases. The entire pypy perhaps. What do you suggest? Possible?

[22:04] <arigo> sure, we're using the entire PyPy as a test case for some of our own tests :-)

[22:05] <mayall> But is there a libpython dependency?

[22:05] <arigo> no, pypy/interpreter/py.py is a pure Python program

[22:05] <arigo> doesn't even use that many C extension modules

[22:06] <mayall> Good!

[22:06] <arigo> you're looking for test cases for bytecode compiler/optimizers?

[22:07] <mayall> I'd like a *real* program with few deps of C modules

[22:07] <mayall> A compiler would be good so I could test exec

[22:08] <mayall> (The program that prints itself forever, for example:)

[22:09] <arigo> actually, I don't know about pyvm

[22:09] <mayall> Don't you read c.l.py?

[22:09] <arigo> generally not

[22:09] <mayall> http://students.ceid.upatras.gr/~sxanth/pyvm/

[22:10] <mayall> take your time...

[22:14] <arigo> is that related to pyvm.sourceforge.net?

[22:15] <mayall> NO!

[22:15] <arigo> ok

[22:15] <mayall> Misnomer

[22:15] <arigo> indeed

[22:15] <arigo> I'd be more interested in looking at sources than trying benchmarks with a precompiled binary, though :-)

[22:16] <mayall> Definitely. If I was in your place I'd refuse to even try it

[22:17] <mayall> I just want to ensure that it doesn't get buried under the noise

[22:17] <arigo> is that a complete replacement for CPython?

[22:18] <mayall> Could be. With the help of others. Too much work for one person. Right now it's a replacement of Python 0.9 :)

[22:19] <arigo> definitely too much work for one person

[22:19] <arigo> though you're getting good speed-ups

[22:19] <arigo> what is the motivation?

[22:20] <mayall> Just needed some bytecode engine for an irrelevant project.

[22:20] <mayall> Then I wanted to surprise Tim Peters:)

[22:22] <mayall> BTW, I believe I can get more speed. At some point I should stop with all these optimizations and go for something more important.

[22:23] <mayall> Which is the batteries...

[22:29] arigo (~arigo@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) left irc: Remote closed the connection

[22:29] arigo (~arigo@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) joined #pypy.

[22:30] <arigo> hum, my X is not very stable...

[22:30] <mayall> phew

[22:40] Action: arigo reading the comp.lang.python thread

[22:40] Action: mayall knew it

[22:41] <arigo> you never said what kind of tricks you used to make pyvm faster; depending on the answer, they could be useful for CPython, for PyPy, or both

[22:42] <arigo> (e.g. PyPy's ambitions are to integrate Psyco techniques, which don't depend on bytecode-level optimizations)

[22:42] <mayall> I said that it's built from scratch. That's enough IMHO. And I am a very experienced programmer....

[22:42] <mayall> Psyco is good at exposing staticness

[22:42] <arigo> do you know the CPython core?

[22:43] <mayall> Of course! I've been browsing at Cpython all the time.

[22:43] <arigo> I think lots of experienced programmers have put efforts in there and they (and me!) would be interested in a more technical answer to that question :-)

[22:44] <arigo> at least, on which level you're significantly differing from the CPython approach

[22:44] <mayall> That proves that pyvm is good work! ...

[22:44] <mayall> For one, it is stackless (actually a hybrid)

[22:45] <mayall> The source *will* be open ...

[22:45] <arigo> :-)

[22:45] <arigo> do you use the same bytecode format?

[22:46] <mayall> Yes. Mod minor transformations. I'm anxiously looking forward to what AST can offer.

[22:46] <arigo> also -- not really talking for myself, but you should've realized by now that you'll lose interest from this community if you just want to keep things closed.

[22:47] <mayall> I know. But there are about 20 people who are ***very*** interested to see WTF is going on inside pyvm.

[22:47] <mayall> Once they see they'll go back hacking their own vms

[22:48] <mayall> In that case, i'd rather kill the project right now

[22:48] <arigo> as someone pointed out, you should also explain if, and if not why, it's possible to put the same optimizations you did into the core of CPython

[22:48] <mayall> Because patches are stalled forever on sourceforge?

[22:49] <arigo> answer on comp.lang.python :-) and a more technical answer would be nicer :-)

[22:49] <mayall> The technical answer is that: in a couple of weeks, pyvm will be open and you'll see.

[22:50] <arigo> ok, good enough.

[22:50] <mayall> Thanks :)

[22:51] <mayall> So... about PyPy...

[22:51] <mayall> Could it be useful for me (or pyvm useful for pypy)?

[22:52] <arigo> well, as I said, I need more technical info to answer this question :-)

[22:52] <mayall> You give it bytecode and it runs the bytecode :)

[22:53] <arigo> if you're looking for a benchmark, PyPy uses lots of parts of the language. It'd be a good completeness test.

[22:54] <mayall> Do you use lots of : new style classes, operator overloading, weak references, closures?

[22:54] <arigo> all of that apart from weak references

[22:54] <mayall> Phew!

[22:54] <mayall> I'm thinking about dropping weakrefs.

[22:55] <mayall> Do you use MRO?

[22:55] <arigo> yes, we actually don't support them in PyPy either, at the moment

[22:55] <arigo> custom MROs ? I don't think we use that (we support that, though)

[22:56] <mayall> Basically I mean that super() can do the right thing in diamond hierarchies.
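
What "the right thing in diamond hierarchies" means can be illustrated with a tiny example in present-day Python syntax (the class names are arbitrary): with cooperative `super()` calls, each class in the diamond runs exactly once, in method resolution order.

```python
class A:
    def who(self):
        return ["A"]


class B(A):
    def who(self):
        return ["B"] + super().who()


class C(A):
    def who(self):
        return ["C"] + super().who()


class D(B, C):
    def who(self):
        return ["D"] + super().who()


# The MRO linearizes the diamond: D, B, C, A, object.
mro_names = [cls.__name__ for cls in D.__mro__]

# super() in B continues with C (not A), so A is reached only once.
calls = D().who()
```

An implementation that resolved `super()` in `B` straight to `A` would skip `C` entirely, which is the case a bytecode engine has to get right.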

[22:57] <arigo> ah, ok... not exactly sure about that. We have only few cases of multiple inheritance

[22:57] <arigo> ah no, there is a crucial usage somewhere :-)

[22:57] <arigo> a 10-liner pure-python implementation of multimethods :-)

[22:58] <mayall> OT: If you are busy with something else feel free to tell me. I feel like I may be wasting your time.

[22:59] <mayall> Does PyPy use python's "batteries"?

[22:59] <arigo> no, I'm busy finishing to read the thread on comp.lang.python :-)

[22:59] <arigo> PyPy can "cheat" and expose the underlying CPython's C modules, yes

[23:00] <mayall> There's an interesting thread "mix/max on a list" :)

[23:01] <mayall> I guess you don't use any "frame objects", which is definitely good for me

[23:03] <arigo> in PyPy?

[23:03] <mayall> yep

[23:03] <arigo> no, we do have frame objects

[23:03] pedronis_afk (~Samuele_P@c-3a8b70d5.022-54-67626719.cust.bredbandsbolaget.se) joined #pypy.

[23:04] <mayall> I'm still a bit confused with how you bootstrap PyPy

[23:05] <arigo> that's definitely a confusing subject in PyPy :-)

[23:05] <arigo> we load and initialize the whole interpreter and libraries in memory on top of CPython

[23:06] <arigo> then we compile it from the in-memory bytecode into C

[23:06] <mayall> Hmmm. So if I have the bytecode as a C-array, I could bootstrap in pyvm. ...

[23:08] <arigo> yes, we could generate the bytecode back to disk, or whatever

[23:08] <mayall> OK. I'll give it a try.

[23:09] <arigo> there are practical problems, though

[23:09] <arigo> we have references to lots of CPython built-in functions and modules, at this point

[23:09] <mayall> I can provide some of those.

[23:10] <arigo> ok, so yes that's definitely possible (but some work).

[23:10] <mayall> where's the download link btw?

[23:10] <arigo> http://codespeak.net/pypy/

[23:11] <arigo> you might want to compare with pypy/translator/geninterplevel.py, which translates our data back to Python sources (in a custom format -- but with a few changes it could generate plain Python again, or directly bytecodes)

[23:12] <mayall> Ok. But I still can't find the link!

[23:13] <arigo> http://codespeak.net/pypy/index.cgi?doc/getting_started.html

[23:13] <arigo> (there is no tarball distribution for now)

[23:14] <mayall> Alright. It'll take some time until I find a svn client for windows

[23:15] <mayall> And much more time until I figure out what's going on with PyPy :)

[23:15] <arigo> :-)

[23:16] <mayall> 221 c-ya!

[23:17] mayall (~mayall@212-70-201-150.ath.dialup.tee.gr) left irc: "ChatZilla 0.9.52B [Mozilla rv:1.6/20040113]"

----- silence for 34 minutes -----

[23:51] aleale (~redorlik@cpe.atm0-0-0-129140.0x3ef2fa3a.bynxx3.customer.tele.dk) left irc: ":-) :-> ;-) :) "Smilies everyone, Smilies" Mr.Rourke"

[00:00] --- Thu May 12 2005