Re: Question about CMUCL type inference
On the cmucl-help@cons.org mailing list, on Thu, Aug 16, 2001 at
07:17:10PM +0200, Nicolas Neuss wrote:
> In a discussion on the PLT Scheme mailing list about type inference I
> asked Shriram Krishnamurthi about the relation of his ideas to CMUCL
> type inference:
>
> http://www.cs.utah.edu/plt/mailarch/plt-scheme-2001/msg00799.html
>
> He answers (not yet in the archive, so I repeat it here):
>
> ----------------------------------------------------------------
> Nicolas Neuss wrote:
>
> > You certainly know the type inference which CMUCL is using.
>
> No, I don't. That is, I know it's there, and I've seen examples of
> it, but I don't understand how it works. (I've read the manual. This
> gives me several examples, but no spec. Maybe the manual has changed
> since I last read it?)
>
> I don't know what mechanism they use to implement type inference, and
> how it stays tractable in the presence of union types and range types
> and so forth. I don't fully understand "dynamic type inference"; it
> looks more like "static flow analysis" to me. Certainly what they
> accomplish looks impressive -- e.g., if-splitting followed by
> optimizations -- but how?
>
> Surely someone on this list can point me to a technical paper that
> explains exactly what they do? I'd be much obliged.
>
> Shriram
>
> ----------------------------------------------------------------
>
> I guess people on this list are best suited to answer his question.
> Could someone do so, ideally also cc-ing plt-scheme@fast.cs.utah.edu?
I'm not aware of anything published about the CMU CL type inference
system.
I'm not sure what "dynamic type inference" would mean, and I agree
that what's going on looks like (my compiler non-guru's idea of)
static flow analysis. Most of the type information seems to flow from
* type declarations in the user source code (of course)
* type declarations built into the CMU CL system (mostly with the
DEFKNOWN macro)
* type inference based on argument types, mostly through things
defined with the DERIVE-TYPE variant of the DEFOPTIMIZER macro
Then the type information is used by various compiler transforms
and primitive operations which have type constraints on their
operands.
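As an illustration of the third bullet (a hypothetical Python sketch, not
CMUCL source), a DERIVE-TYPE-style rule for + on integer range types just
does interval arithmetic on the argument types:

```python
# Hypothetical sketch of a DERIVE-TYPE-style rule (not CMUCL code):
# given range types for the arguments of +, derive a range type for
# the result by interval arithmetic on the bounds.

from dataclasses import dataclass

@dataclass(frozen=True)
class IntRange:
    """Roughly (INTEGER low high) in CL type-specifier terms."""
    low: int
    high: int

def derive_type_plus(x: IntRange, y: IntRange) -> IntRange:
    """Derive the result type of (+ x y) from the argument types."""
    return IntRange(x.low + y.low, x.high + y.high)

# (+ (integer 0 10) (integer -5 5)) derives to (integer -5 15)
result = derive_type_plus(IntRange(0, 10), IntRange(-5, 5))
```

The real optimizers handle open bounds, unbounded ranges, floats, and so
on, but the shape of the rule is the same: argument types in, result type
out.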
I don't think there's a single simple answer to the question about how
the type inference stays tractable in the presence of union types and
so forth. Two parts of the answer are:
1. The algebra of types is implemented in a hand-rolled object
system. The methods for some operations (e.g. type intersections,
and to some extent type unions) punt, representing the result of
the operation as a type which is sort of opaque, so that all
further operations on it yield other opaque types, or
"don't know" answers to boolean queries. This punting tends to
limit the complexity of nonopaque types. If you want to see how
it does it in a particular case, you can search the sources
for uses of the macro DEFINE-TYPE-METHOD, which is used to
define the methods. Unfortunately the hand-rolled object system
is (in my opinion) a mite confusing. (I wish it were CLOS, but
since I'm pretty sure it predates CLOS by many years, that's
of course unreasonable.) Still, you can see a lot of what's going
on by looking at the methods without understanding the method
combination system in any detail.
2. I think the optimizer uses some sort of cutoff to stop refining
type information after a certain number of passes. I can't give
you a citation for this (to code or documentation), but I
think I've seen it somewhere, and it seems to explain why
the system doesn't want to optimize some complicated code for
me, and finding and changing this limiting behavior is (somewhere
fairly far down) on my to-do list. I suspect that this may be
important in order to keep the optimizer from getting stuck
in problems where it refines e.g. (RATIONAL 0 1/2) to
(RATIONAL 0 1/4) on the first pass, then to (RATIONAL 0 1/8)
on the next pass, then to (RATIONAL 0 1/16) on the next pass,
and so on ad infinitum without being brilliant enough to jump
to the limit (RATIONAL 0 0).
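The punting in #1 can be sketched like this (a hypothetical Python
illustration, not the actual DEFINE-TYPE-METHOD machinery): when an
intersection of union types would get too complicated, return an opaque
type, and let everything downstream of it answer "don't know":

```python
# Hypothetical sketch (not CMUCL's type methods): intersect union-of-range
# types, but "punt" to an opaque type once the result grows too complex.

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Range:
    low: int
    high: int

@dataclass(frozen=True)
class UnionType:
    members: Tuple[Range, ...]

class Opaque:
    """Stand-in for an opaque type: queries on it answer 'don't know'."""

MAX_MEMBERS = 4  # hypothetical complexity cutoff

def intersect_ranges(a: Range, b: Range) -> Optional[Range]:
    lo, hi = max(a.low, b.low), min(a.high, b.high)
    return Range(lo, hi) if lo <= hi else None

def intersect(a, b):
    if isinstance(a, Opaque) or isinstance(b, Opaque):
        return Opaque()          # opaqueness is contagious
    members = tuple(r for x in a.members for y in b.members
                    if (r := intersect_ranges(x, y)) is not None)
    if len(members) > MAX_MEMBERS:
        return Opaque()          # punt: too complicated to represent
    return UnionType(members)
```

The cutoff value and the representation are made up, but this is the gist
of how punting bounds the complexity of nonopaque types.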
My answer to #1 is based on a fairly good understanding of the type
system, since I rewrote the implementation of type intersections for
the SBCL fork of CMU CL. My answer to #2 is, obviously, based on much
more superficial knowledge of the optimizer.
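The pass-limit idea in #2 might look something like this (again a
hypothetical Python sketch, not the actual optimizer): cap the number of
refinement passes, so a bound that keeps shrinking, like the RATIONAL
example above, stops at a sound but imprecise type instead of looping
forever:

```python
# Hypothetical sketch of a refinement-pass cutoff (not CMUCL's optimizer).
# Each pass halves the derived upper bound, as in the
# (RATIONAL 0 1/2) -> (RATIONAL 0 1/4) -> ... example; the cap makes the
# analysis terminate with a sound, if imprecise, bound.

from fractions import Fraction

MAX_PASSES = 5  # hypothetical cutoff on refinement passes

def refine(high: Fraction) -> Fraction:
    """One inference pass over a loop that keeps halving the bound."""
    return high / 2

def derive_upper_bound(start: Fraction) -> Fraction:
    high = start
    for _ in range(MAX_PASSES):
        new = refine(high)
        if new == high:          # fixed point reached: exact answer
            return high
        high = new
    return high                  # cutoff hit: settle for what we have
```

A smarter widening operator would jump straight to the limit
(RATIONAL 0 0); the cutoff merely guarantees termination.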
The system claims to adhere to the principle that "declarations are
assertions", and in fact mostly does, so it seems as though it would be
a nice match for Scheme R4RS, which is the last version I'm familiar
with. (In Scheme, you could just run the principle backwards,
optimizing code downstream of a type check/assertion with type
knowledge based on the fact that the check succeeded.)
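Running the principle backwards could be sketched like this (a
hypothetical Python illustration, with abstract types modeled as sets of
concrete representations): once a type check succeeds, the environment
for the downstream branch records the intersection of what was already
known with what the check guarantees:

```python
# Hypothetical sketch: flow-sensitive narrowing after a type check,
# modeling abstract types as sets of possible concrete representations.

INTEGER = frozenset({"fixnum", "bignum"})
NUMBER = INTEGER | frozenset({"ratio", "single-float", "double-float"})

def narrow(env: dict, var: str, checked: frozenset) -> dict:
    """Environment for the branch where the type check on VAR succeeded:
    intersect the known type with the checked type."""
    new = dict(env)
    new[var] = env.get(var, NUMBER) & checked
    return new

# After (if (integerp x) <then> <else>), the then-branch knows more:
env = {"x": NUMBER}
then_env = narrow(env, "x", INTEGER)   # x now known to be an integer
```

This is essentially the if-splitting effect Shriram mentions: the check
itself becomes a source of type information for one arm of the branch.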
--
William Harold Newman <william.newman@airmail.net>
"Three hours a day will produce as much as a man ought to write."
-- Anthony Trollope
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C B9 25 FB EE E0 C3 E5 7C