Marvin Self For Squeak
Pavel Krivanek

What’s Marvin?

Marvin is a Self dialect which combines characteristics of the Self 
programming language and Smalltalk-80.

What is the status of this project?

It’s bleeding edge. It is not yet suitable for practical use, but 
there is a functional implementation which can compile and execute code.

Why a new language?

Smalltalk isn’t suitable for prototype-based systems. It has no 
literals for objects and you have to use the pseudo-variable "self" 
extremely frequently. The Self programming language, on the other 
hand, can hardly be integrated with Squeak.

In short, Marvin is Self with Squeak literals and conventions. It’s 
integrated with the Squeak environment.

Does Marvin have any special interpretation layer?

No, it is compiled directly to the native bytecodes of Squeak and 
executed by the virtual machine.

Do I need any special version of the virtual machine?

Yes, you need a virtual machine with an enhanced sending mechanism.

What are the VM modifications?

Currently Marvin adds two new primitives for defining the prototype 
class, and it modifies the sending and resending mechanism. If the 
receiver of a message is an instance of the prototype class, the VM 
uses a different lookup algorithm based on delegation.

Is this special virtual machine slower?

Yes, of course. The extra test currently costs about 2% of speed, but 
after optimization it may come down to a single additional comparison 
of two integer values, so there will be no relevant slowdown.

What’s the prototype class?

Instances of this class (MarvinPrototype) are prototypes – objects 
with slots. Prototypes have no instance variables.

How does delegation work?

When you send a message to a prototype, it searches through its slots 
and, if it finds a matching slot, performs the appropriate operation: 
it reads or writes the value of the slot, or calls the compiled method 
stored in the slot. If no matching slot is found, the process 
continues with the objects referenced by parent slots.

What’s the physical structure of a prototype?

In fact, a prototype has indexable pointer variables (like Array). It 
has four sections of slots, separated from each other by nil. Every 
slot takes 1-3 elements, depending on the slot type:

method slot:

- takes 2 elements
- the first element is a reference to the method selector (like #method)
- the second element is a reference to the compiled method
- if the selector of the sent message is the same reference as the 
  first element, the compiled method is executed

writable data slot:

- takes 3 elements
- the first element is a reference to the read message selector (like 
  #variable)
- the second element is a reference to the write message selector (like 
  #variable:)
- the third element is a reference to the slot value
- if the selector of the sent message is the same as the first element, 
  the result of the message send is the third element
- if the selector of the sent message is the same as the second 
  element, the VM takes the argument from the stack and stores it into 
  the third element

read-only data slot

- has the same structure as a writable data slot
- the second element refers to the same object as the first element 
  (the read message selector)

parent slot

- can be read-only or writeable
- has the same structure as data slots (3 elements)

The order of slots is:

- parent slots
- method slots
- data slots
- indexable slots

For example, the Self object

( |
parent* = lobby.
method = ( 3+4).
x <- 5.
y = nil.
| )

is stored as an array of references to these objects:

01: symbol #parent
02: symbol #parent
03: object lobby
04: nil (separator)
05: symbol #method
06: compiled method
07: nil (separator)
08: symbol #x
09: symbol #x:
10: number 5
11: symbol #y
12: symbol #y
13: nil (value)
14: nil (separator)

A prototype with no slots is an array of three nils.
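
The layout and lookup described above can be sketched in a few lines of 
Python. This is only an illustrative model, not Marvin's VM code: 
Python lists stand in for prototype arrays, None for nil, strings for 
selector symbols and lambdas for compiled methods. The names 
parse_slots, lookup and send are hypothetical, and the value answered 
by a write is an assumption.

```python
# Illustrative model only: lists stand in for MarvinPrototype arrays,
# None for nil, strings for selector symbols. All names are hypothetical.

SECTIONS = (("parent", 3), ("method", 2), ("data", 3))  # (kind, slot size)


def parse_slots(array):
    """Walk the three slot sections.

    Returns (slots, index of the first indexable element).
    """
    slots, i = [], 0
    for kind, size in SECTIONS:
        # A nil in selector position is the separator ending the section.
        # Fixed slot sizes let us step over nil *values* (like y = nil).
        while array[i] is not None:
            slots.append((kind, i))
            i += size
        i += 1  # skip the separator
    return slots, i


def lookup(array, selector):
    """Return (holder, kind, index) of the matching slot, or None."""
    slots, _ = parse_slots(array)
    for kind, i in slots:
        if selector == array[i] or (kind != "method" and selector == array[i + 1]):
            return array, kind, i
    # No local match: continue with the objects referenced by parent slots.
    for kind, i in slots:
        if kind == "parent" and isinstance(array[i + 2], list):
            found = lookup(array[i + 2], selector)
            if found is not None:
                return found
    return None


def send(receiver, selector, *args):
    found = lookup(receiver, selector)
    if found is None:
        raise LookupError(selector)
    holder, kind, i = found
    if kind == "method":
        return holder[i + 1](receiver, *args)  # run the "compiled method"
    if selector == holder[i]:
        return holder[i + 2]                   # read selector: answer value
    holder[i + 2] = args[0]                    # write selector: store argument
    return receiver                            # assumption: write answers receiver


# A stand-in lobby with one read-only data slot, and the example object.
lobby = [None, None, "answer", "answer", 42, None]
obj = ["parent", "parent", lobby, None,
       "method", lambda self: 3 + 4, None,
       "x", "x:", 5,
       "y", "y", None,
       None]
```

Note that the walk never scans for nil blindly: because every slot type 
has a fixed size, the nil stored as the value of y is stepped over and 
only a nil in selector position ends a section.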

Why do read-write data and parent slots contain both selectors and not 
only the slot name?

It’s a speed optimization. The virtual machine can simply compare 
references to selectors and doesn’t have to concatenate strings and 
compare them.

Why don’t read-only data and parent slots take only 2 elements - slot 
name and value?

This way we don’t have to introduce two more slot types, and every 
slot type has a fixed size.

What are the indexable slots?

The delegation lookup stops at the third separator (nil). The rest of 
the prototype can contain arbitrary references, so prototypes can be 
used as collections. The only disadvantage is that the index of the 
first element is not known in advance, so we have to search 
sequentially for the position of the last separator.
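
A minimal Python sketch of that sequential search, again using lists 
for prototype arrays and None for nil. The function names 
first_indexable and indexable_at are hypothetical, and the 0-based 
indexing is a choice of this sketch, not of Marvin.

```python
# Illustrative model only; names and 0-based indexing are assumptions.

def first_indexable(array):
    """Sequentially locate the element after the third separator."""
    i = 0
    for slot_size in (3, 2, 3):      # parent, method and data sections
        while array[i] is not None:  # nil in selector position = separator
            i += slot_size
        i += 1                       # step over the separator
    return i


def indexable_at(array, n):
    """Answer the n-th indexable element (0-based in this sketch)."""
    return array[first_indexable(array) + n]


# One writable data slot x, followed by three indexable elements.
p = [None, None, "x", "x:", 56, None, "a", None, "c"]
```

Here the indexable part starts at position 6 and may itself contain 
nil, which is why the separator has to be found by walking the slot 
sections from the front.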

Can we create prototypes without Marvin compiler?

Yes, a prototype can be built directly from an array.

p := MarvinPrototype withAll: #( nil nil #x #x: 56 nil).
p x --> 56
p x: 42.
p x --> 42

Or you may use this way:

p := MarvinPrototype new.
p AddAssignSlot: #x value: 56.

Can prototypes refer to standard Smalltalk methods?

Yes, with one limitation. You may use something like:

lobby := MarvinPrototype new.

lobby AddMethodSlot: #slotNotFound: value: (MarvinPrototype class >> 
#lobbyDNU:).

but if this compiled method contains a super send, you have to put a 
reference to the prototype that owns the method into its last literal 
(in Smalltalk, this literal holds the class that owns the method).

Why do I have to specify the owner of the compiled method?

Because the virtual machine modifies super sends too. When the 
receiver is an instance of the prototype class, the super send 
bytecodes are interpreted as resends.

What’s a resend?

It’s a modified delegation send. The lookup doesn’t start from the 
slots of the receiver but from the parents of the object in which the 
executing method is defined.
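
The difference between a normal send and a resend can be sketched in 
Python. This is an illustrative model under simplifying assumptions: 
slots are kept in a dictionary rather than in Marvin's array layout, a 
method is a Python callable that receives the receiver and its holder 
explicitly, and the names Proto, send and resend are hypothetical. The 
values mirror the delegation example given later in this document.

```python
# Illustrative model only: dictionary-based slots, hypothetical names.

class Proto:
    def __init__(self, parents=(), **slots):
        self.parents = list(parents)
        self.slots = dict(slots)


def lookup(obj, name):
    """Normal lookup: own slots first, then the parents."""
    if name in obj.slots:
        return obj, obj.slots[name]
    return lookup_in_parents(obj, name)


def lookup_in_parents(obj, name):
    """Lookup that skips obj's own slots and starts at its parents."""
    for parent in obj.parents:
        found = lookup(parent, name)
        if found is not None:
            return found
    return None


def activate(receiver, found, name, args):
    if found is None:
        raise LookupError(name)
    holder, value = found
    if callable(value):
        # The method runs with the original receiver; the holder is kept
        # so that a resend inside the method knows where to continue.
        return value(receiver, holder, *args)
    return value


def send(receiver, name, *args):
    return activate(receiver, lookup(receiver, name), name, args)


def resend(receiver, holder, name, *args):
    # Start from the parents of the object that holds the running method.
    return activate(receiver, lookup_in_parents(holder, name), name, args)


parent = Proto(a=6)
child = Proto([parent], a=3, b=4,
              sum=lambda self, holder: resend(self, holder, "a") + send(self, "b"))
```

With these objects, a plain send of a to child finds the local slot 
(3), while the resend inside sum skips it and finds the parent's slot 
(6), so sum answers 6 + 4.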

Does Marvin have temporary variables in methods and blocks?

No, it uses slots like Self, so if we have this Smalltalk method

sum
| a b |
a := 3.
b := 4.
^ a + b

Marvin’s equivalent method slot is:

sum = ( | a. b |
a: 3.
b: 4.
^ a + b )

However, the local slots of methods and blocks are simulated by 
temporary variables.

Self methods return the result of the last expression implicitly. Why 
is there an explicit return here?

That’s because Marvin uses Squeak conventions for methods and blocks. 
Methods return the receiver implicitly, and blocks return the result 
of the last expression or nil (if they are empty).

Can you specify the values of local slots of blocks and methods in 
their definition?

Yes, the previous example can look like:

sum = ( | a = [3]. b = [4] |
^ a + b )

Why these ugly square brackets?

If you specify a slot value in Self, the assigned expression is 
evaluated at compile time in the context of the lobby, so if you 
create this object:

( | slot = self | )

it contains one slot, named “slot”, holding the lobby object! Square 
brackets set the expression apart visually.

Square brackets also help to have unambiguous grammar.

In the current version a slot can contain only the result of a single 
expression (as in Self), but in future versions it will be a whole 
expression sequence (including slot definitions).

Ok, but is it readable if you write object literals?

No. Look at this example (delegation demonstration, the result is 10).

(|
parent* = [ (|
a = [6]
|) ].
a = [3].
b = [4].
sum = ( ^ resend a + b)
|) sum

Maybe we will add a syntactic shortcut for the sequence of three 
characters [ ( | .

Can I use a block in slot value?

Yes, you can. An example:

(|
sum = [[ |:a. :b | a + b ]].
test = ( ^ sum value: 3 value: 4 )
|) test

Note that this construction fails in Self, because the block stored 
in the slot expires after compilation.

How are methods with arguments defined?

Unlike Self, Marvin has only one way to define methods (the Smalltalk 
way):

(|
a = [3].
sum: b = ( ^ a + b )
|) sum: 4

Can objects contain code?

No, only blocks and methods can contain code (like in Self).

Can methods use methods?

No, nested methods are forbidden (like in Self).

Can methods and blocks contain parent slots?

No, unlike Self. Marvin uses standard Squeak blocks and has no 
activation objects (in Self, parent slots are added to activation 
objects). But this is not an important limitation.

Are there any special naming conventions for slot names (methods)?

Marvin is fully integrated with Squeak, so it has to use Squeak naming 
conventions. It therefore uses ifTrue:ifFalse: and not ifTrue:False: 
etc. The only convention of its own is that the names of primitive 
methods begin with a capital letter.

Does Marvin have literals for characters, arrays etc.?

Yes, Marvin uses all Squeak literals except scaled decimals and 
expression arrays. Both will be added in future versions. Marvin 
creates these literals as standard Smalltalk objects.

Does Marvin have its own class for blocks?

No, even blocks are standard Smalltalk objects, so you can use all 
their capabilities, such as multitasking.

(|
parent* = [ lobby ].
test = (
| number = [50]. result |
[
result: number factorial.
inform: result asString
] forkAt: Processor userBackgroundPriority )
|) test

Does Marvin have object and slot annotations like Self?

No, it doesn’t. They would be problematic to implement and can simply 
be replaced by a more general annotation protocol analogous to class 
organizations. It’s not important to support them in the grammar.

Does Marvin have primitive methods?

If no matching slot is found during delegation, the virtual machine 
tries to find the method in the prototype class (as for any other 
Smalltalk object). That’s why prototypes fit in well with Squeak: you 
can print them, explore them, etc.

Marvin doesn’t need classical primitive methods like Smalltalk because 
it can use standard Squeak infrastructure.

Can I use Squeak classes in Marvin programs?

Yes, if you wish. If your object has the lobby as its parent, it can 
access global objects in the Smalltalk system dictionary.

Can I use standard Smalltalk objects and classes as parents of 
prototypes?

This is a very problematic operation. In theory we could delegate to a 
Smalltalk class, but only if it has no instance variables. The current 
implementation only tries to resend messages to standard objects 
referred to by parent slots, and even this doesn’t work well.

This is perhaps the most limiting aspect of Marvin’s design.

Are comments in Marvin the same as comments in Smalltalk and Self?

Marvin uses standard comments. In addition, its lexical analyzer 
supports line comments (something like // in C++). They start with two 
consecutive double-quote characters.

(|
a = [3]. "" line comment
b <- [4]. "block comment"
|)

Can I use Unicode?

Yes, when you use Squeak 3.8 you can write non-ASCII characters in 
string literals, comments etc. You cannot use Unicode in identifiers 
(unlike Squeak).

Does Marvin do a full tree search during delegation?

No, it doesn’t. Unlike Self, Marvin doesn’t check for ambiguous calls 
and uses only a simple depth-first search, so the result depends on 
the order of the parent slots. Thanks to this property, however, 
Marvin is more flexible when redefining namespaces.
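
The effect of parent order can be illustrated with a small Python 
model. It is only a sketch: slots are kept in a dictionary, and the 
names Proto and find_holder as well as the example objects are 
hypothetical.

```python
# Illustrative model only: dictionary-based slots, hypothetical names.

class Proto:
    def __init__(self, name, parents=(), **slots):
        self.name = name
        self.parents = list(parents)
        self.slots = dict(slots)


def find_holder(obj, selector):
    """Depth-first search: the first match wins, no ambiguity check."""
    if selector in obj.slots:
        return obj
    for parent in obj.parents:  # parents are tried in slot order
        holder = find_holder(parent, selector)
        if holder is not None:
            return holder
    return None


grand = Proto("grand", greet="from grand")
first = Proto("first", [grand])
second = Proto("second", greet="from second")

child = Proto("child", [first, second])
swapped = Proto("swapped", [second, first])
```

For child, the search descends through first into grand before it ever 
looks at second, so grand's greet slot is found. For swapped, second 
comes first and its slot wins. In neither case is the second candidate 
reported as an ambiguity.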

What are the benefits of Marvin?

It brings the power of classless programming to Squeak in a form 
which can be very familiar to current Squeakers, and it can combine 
the main advantages of Squeak and Self in one compact system.

Squeak gets multiple inheritance, dynamic inheritance, namespaces, 
mixins, etc. This can be especially useful for UI projects like 
eToys.

Is there any outliner?

No, not yet. For now you can only evaluate code (directly or using 
the SmaCC GUI).

Is there any decompiler and debugger?

No, there isn’t.

Are examples in this document runnable?

Yes, they are.

Can we see any more advanced example?

Yes, this example (http://www.comtalk.net/Squeak/95) shows how modules 
can be implemented in Marvin. The demonstration system has two 
separate parts: an application and a kernel. The application asks the 
kernel whether it can load a module (a set of traits and globals), and 
the kernel loads a copy of the module into the application’s lobby. 
The application thus has its own namespace and can modify its modules 
without affecting the kernel or other applications. It’s a kind of 
sandbox.
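
The copy-on-load idea can be sketched in Python. This does not 
reproduce the linked example: the Kernel class, its register and 
load_into methods and the module contents are all hypothetical, and 
plain dictionaries stand in for lobbies and modules.

```python
import copy

# Illustrative model only; every name here is hypothetical.

class Kernel:
    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        self._modules[name] = module

    def load_into(self, lobby, name):
        """Copy a module into an application's lobby, if it exists."""
        if name not in self._modules:
            return False
        # A deep copy gives the application a private namespace.
        lobby[name] = copy.deepcopy(self._modules[name])
        return True


kernel = Kernel()
kernel.register("collections", {"traits": {"size": 0}, "globals": {"limit": 10}})

app_a, app_b = {}, {}
loaded_a = kernel.load_into(app_a, "collections")
loaded_b = kernel.load_into(app_b, "collections")

# One application changes its copy; nothing else is affected.
app_a["collections"]["globals"]["limit"] = 99
```

After the modification, app_b and the kernel still see the original 
limit of 10, which is the sandbox property described above.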

Where can I find sources?

Here: http://www.comtalk.net/Squeak/95

Is there any prepared virtual machine?

Yes, but only for Windows, sorry. My attempts to build a VM for Linux 
failed (I have Gentoo on amd64).

What if I want to build a VM for Linux?

Use the standard build process with VMMaker. Marvin’s VM for Windows 
uses an older version of VMMaker than the Linux versions, so make sure 
all of Marvin’s modifications are compatible with your version. If you 
succeed, please publish the binaries.

Known problems?

I don’t know any part of current implementation which isn’t 
problematic :-)

Your help is needed!

-- Pavel Krivanek