Below is part 1 of my notes from installing and using the most recent
version of Magma this morning. They are literally just a log of what I am
doing and what successes and failures I come across, so they will probably
be boring for the casual reader, but I hope they provide you with some
useful info.
Started by installing the Magma server package into a fresh 3.7b-5969
image, and then saving two copies: magmaserver.image and
magmaclient.image. Both are running on Squeak 3.7.1beta2 Carbon VMs on
Mac OS X 10.3.
Following the instructions at http://minnow.cc.gatech.edu/squeak/2689, I
executed the following code in a workspace:
MagmaRepositoryController
create: 'magma/myrepos.magma'
root: Dictionary new
No errors.
I then tried this:
MagmaServerConsole new
open: 'magma/myrepos.magma'
processOn: 51969
And got the error "magma/myrepos.magma" not found.
Ok, so creating the repository didn't work. Maybe I need to create the
'magma' directory myself first? If so, I would have expected the error
to occur when trying to create the repository, not while trying to
open it, but let's try that.
Saba:~ avi$ cd Documents/Squeak
Saba:~/Documents/Squeak avi$ mkdir magma
Now trying this code snippet again:
MagmaRepositoryController
create: 'magma/myrepos.magma'
root: Dictionary new
And I get the following error:
MessageNotUnderstood: UndefinedObject>>binary
This is inside MaObjectFiler>>createDataFile.
Ok, maybe I screwed things up the first time. Looking around for a way
to reset the state, I find MagmaRepositoryController class>>initialize,
and try it. Nope, doesn't fix it. Ok, I'll trash this image and start
again.
From a fresh image, the same problem. Maybe if I create the file first?
Saba:~/Documents/Squeak avi$ touch magma/myrepos.magma
No dice.
Just as a sanity check, I try "FileStream fileNamed:
'magma/myrepos.magma'" and get back a working filestream. Poke around
in the code...
<several minutes later>
The issue seems to be a strange interaction between MaFilename and the
Carbon VM. Although FileStream knows how to expand
'magma/myrepos.magma' into a suitable absolute path for Carbon,
MaFilename doesn't (it tries to expand it into a unix filename
instead). If I use FileStream to give me back the absolute path, and
then feed that to Magma, things work. Ok, we now have a server up and
running. Good.
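For anyone hitting the same thing, here is a rough sketch of that workaround. It is only a sketch under assumptions: it resolves the path with FileDirectory rather than FileStream (so no empty file gets created as a side effect), it assumes the 'magma' directory already exists next to the image, and repositoryPath is just a name picked for illustration.

```smalltalk
"Sketch only: resolve an absolute, platform-correct path first,
 then hand that to Magma instead of the relative name.
 Assumes the 'magma' directory already exists in the image's
 default directory; repositoryPath is an illustrative name."
| repositoryPath |
repositoryPath := (FileDirectory default directoryNamed: 'magma')
	fullNameFor: 'myrepos.magma'.

"Same two calls as before, but with the absolute path."
MagmaRepositoryController
	create: repositoryPath
	root: Dictionary new.
MagmaServerConsole new
	open: repositoryPath
	processOn: 51969.
```

Building the name from a directory object avoids putting a '/' into the string at all, which sidesteps the question of which path delimiter the VM expects.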
I'll send this before I continue, and just make the point (I hope,
constructively) that this is part of what I mean by "too much code":
Magma uses its own Filename class instead of FileStream. Now,
FileStream has its issues, and I can perfectly understand the
temptation to do that. However, FileStream is known to work, on all
platforms, on all VMs - because if it didn't, people would complain
very quickly. Because Magma rolls its own filename handling, it
doesn't get to benefit from this, and in fact, MaFilename is broken on
the platform I happen to be on. Incidentally, this is one of the
problems I ran into when trying out an early version of Magma (that
time, on Linux, I believe), and it's interesting to see that it's still
there.
Ok, now that the server is up and running, switching to the client
image. I create a session with:
mySession := MagmaSession
hostAddress: #(127 0 0 1) asByteArray
port: 51969.
mySession connectAs: 'avi'
And follow the instructions on the wiki to commit something to the
root. No errors.
I create a second session and try to pull my data out of the root.
Success! Great. That was easy.
I try modifying the root in mySession2, and then inspecting it in
mySession. The change came across. Cool.
Ok, let's try some concurrency.
mySession begin.
mySession root at: 'test' put: 7.
mySession2 commit: [mySession2 root at: 'test' put: 8].
mySession root at: 'test' "==> 7"
Good - we're inside a transaction so we don't see the change from
mySession2. What if we try to commit?
mySession commit.
No errors... what happened?
mySession root at: 'test' "==> 8"
Ok... so the commit clearly failed, due to the conflict with
mySession2, which is what I would expect. But is there a way to get a
notification of this? I would probably want to retry the transaction
when this happened.
Browsing through the MagmaSession protocol looking for a way to get
this, nothing jumps out at me. Chris?
Next, try some performance tests. Commit a medium sized
OrderedCollection:
Time millisecondsToRun: [mySession commit: [mySession root at: 'test2'
put: ((1 to: 1000) collect: [:i | i @ i])]]
Executes in 2.4s. That's pretty good. What about to retrieve it?
Time millisecondsToRun: [mySession2 root at: 'test2']
Hm... very strange things are going on. This seems to fail silently -
I never get a number back. Inspecting mySession2 root shows only
'test' as a key. #basicInspect shows the same thing. I try
"mySession2 root at: 'test3'" and get an emergency evaluator, then try
it again and get the same silent-failure behavior. This doesn't bode
well.
Ok, let's try this with a third session. Aha, that worked: 621ms to
retrieve. Try it again, expecting it to be instant (since the objects
should be cached now), but get ~300ms every time. Interesting - why is
that?
I'm also curious how many of the objects in the collection actually got
brought in during that 621ms. Let's try to force them all in:
Time millisecondsToRun: [(mySession3 root at: 'test2') do: [:ea | ea
yourself]]
Again, about 300ms each time. Ok, I clearly don't understand the
caching model here.
Since I'm doing some timings, I might as well compare this to GOODS.
Start up a GOODS image...
Time millisecondsToRun: [db root at: 'test2' put: ((1 to: 1000)
collect: [:i | i @ i]). db commit]
1.6s. Not too different. And retrieval?
Time millisecondsToRun: [db2 root at: 'test2']
GOODS gets 15ms on the first run, 0ms thereafter. Again, let's force
it to bring everything in:
Time millisecondsToRun: [(db2 root at: 'test2') do: [:ea | ea yourself]]
727ms the first time, 3ms thereafter. Two things we seem to be seeing
in GOODS but not in Magma: not bringing in all the objects until
they're needed (it only took 15ms to bring in the collection itself,
the big hit wasn't until accessing its members), and caching the
objects once they're there. Chris, how could I get Magma to do the
same things, if I wanted to?
Ok, that's enough for part II - time to grab some lunch. I especially
want to know what happened to mySession2, which seems to be permanently
hosed now.
(My continuing mission to explore new databases, to seek out strange
bugs, to... never mind).
Ok, the next thing I'm interested in is reliability. IIUC, Magma
doesn't have a transaction log, which means that reliability is
definitely a worry: is it possible for me to corrupt the database? How
easy is it to lose data?
I'm starting a new session, and I'm going to leave it in a loop
committing the current time as quickly as it can:
[[mySession commit: [mySession root at: 'now' put: Time now]] repeat]
fork
Actually this is interesting - what happens when I try to read that
time, while the commit loop is running? Create a new session, try to
access that key:
mySession2 root at: 'now'
Hmm, back to the silent-failure issue from part II.
Create a third session, same problem.
Well, hm, what do I do now? I guess I can halt the ever-committing
process and see what happens then.
Open the ProcessBrowser to try to kill the process (next time, keep a
reference to it in the workspace), and the only one that seems right is
currently in UndefinedObject>>handleSignal:. Maybe we hit an error
while trying to commit? Try to debug the process, my image hangs. Ok,
trash that one.
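A sketch of what keeping a handle on the loop could look like, so it can be stopped without hunting through the ProcessBrowser. The names keepCommitting and committer are made up for illustration, and this assumes workspace variables are visible inside the forked block; whether Magma is happy about having a session used from a forked process at all is exactly the open question here.

```smalltalk
"Sketch only: keepCommitting and committer are illustrative
 workspace variables, not part of Magma."
keepCommitting := true.
committer := [
	"Check the flag between commits so the loop can end cleanly
	 instead of being killed in the middle of a commit."
	[keepCommitting] whileTrue: [
		mySession commit: [mySession root at: 'now' put: Time now]]
] fork.

"Later, from the workspace: ask the loop to stop after the
 current commit finishes."
keepCommitting := false.

"Last resort if the loop is stuck; this kills the process
 wherever it happens to be."
committer terminate.
```

Flipping the flag lets the current commit finish, whereas terminate can interrupt one halfway, so the flag is the gentler option to try first.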
New client image, open a new session, which seems to work - the server
survived. That's good. But there's nothing at 'now' in the root. Did
*none* of those commits work?
Let's try that again with a delay between commits.
[[mySession commit: [(Delay forSeconds: 1) wait. mySession root at:
'now' put: Time now]] repeat] fork
Same problem: trying to access 'now' doesn't work.
Well, maybe I've screwed up the root object somehow. Doesn't seem to
be a way to reset the root, so I'll start a new repository on the
server.
Start a new repository, start a server console going, try to repeat the
test, hit the exact same issue. I wonder - are there issues with
having concurrent client sessions in different threads? Instead of
forking, let's try repeating this commit some finite number of times...
1000 timesRepeat: [mySession commit: [mySession root at: 'now' put:
Time now]]
Ok, now try to access it:
mySession2 root at: 'now'
Oops, same problem. Create a new session?
mySession3 root at: 'now'
Ok, that worked.
Now, let's try killing the server while that repeating commit is going
on.
1000 timesRepeat: [mySession commit: [mySession root at: 'now' put:
Time now]]
First gently: I'm saving and quitting the server image. Some notifiers
popped up about SharedQueue not being empty, but the image quit anyway.
Just out of curiosity, try to use the client session while the server
is down. My image locks up. After a while, I manage to interrupt it,
inside a critical section in MaTcpRequestServerLink>>submit:.
Ok, I bring the server image back up, close the notifiers, and start
the console going again.
Try to connect from the client, but it locks again. Do I need to
manually signal that semaphore?
Try that, but still can't connect. Let's switch to a new client image
too.
Ok, that took quite a while, but it did connect. Try to inspect the
root, but all I get is the silent-failure thing again.
If I look back at the server image, I've got an MNU for UndefinedObject
of #maRead: bytes:bytesFromPosition:of:atFilePosition: . Looks like the
file is nil? But it does exist:
Saba:~/Documents/Squeak/magma avi$ ls -l
total 192
-rw-r--r-- 1 avi staff 44592 9 Jul 12:40 myrepos.magma
-rw-r--r-- 1 avi staff 53184 9 Jul 14:21 myrepos2.magma
Hm... maybe starting up the server console on restart was a bad idea,
maybe it does that for me and I should leave it alone. Quit the server
image, start it up again. Use a fresh client image too.
Nope, same problem. Maybe I hosed the server image by saving and
quitting it? I'll try a brand new server image and use the same
database file.
Ok, that seems to work, and I get the root back ok. Lesson learned: do
not save and quit a running magma server.
Next question - what happens if I kill the process instead? Start the
repeating commit going, then force quit the server.
Start up a new server image. Trying to connect from the old client
image hangs; start a new client image too.
Ok, can connect, but can't get the root - same old strange
silent-failure business. Now I'm really stuck - this is a fresh server
image and a fresh client image, and all I've tried to do is connect and
look at the root. Is my data gone forever?
Well, I think that's about as far as I'm willing to go today. Frankly,
it's a lot further than I would go if I were evaluating Magma for use,
rather than trying to give as much feedback as possible - I hit enough
issues along the way that I would have long since lost the ability to
muster the 110% confidence I would need to entrust my data to it. But
I recognize that it's a work in progress, and so I hope my notes prove
to be useful in taking it further.
---------------------------------------
Hi Avi, I've now had time to go through all three "Magma notes" and
thought it would be easier to summarize my findings in one note. After
reviewing all notes in detail and attempting to reproduce everything, the
quick, summary answer is, I discovered no significant issues. All the
tests that you performed either worked for me or your expectations
differed from how it works.
My main purpose in reporting this is to clarify that the current release,
"1.0gamma7" truly is "gamma" quality IMO, not "alpha" or "beta" quality as
has been characterized, at least somewhat, by the poor experience you had.
> Chris, if there are particular issues I found that you want help in
> reproducing, just let me know.
Yes, I think the next step is to try to establish reproducibility in some
of the issues you encountered. Some kind of straight-run script that
demonstrates a particular problem. My guess is, you will not be able to
produce a script that locks the image, corrupts data, produces
inconsistent results, or anything else really bad like that unless you go
outside the bounds of what Magma "supports". (I plan to add a Swiki page
later this weekend that spells out Magma's boundaries and limitations,
hopefully that will help clarify the air too).
> is it possible for me to corrupt the database?
Yes, there is no transaction logging, so a hardware failure in mid-write
would leave half a transaction committed. It is planned for the future,
but not the top priority on my list just yet.
> How easy is it to lose data?
Actually, not too easy, as long as hardware doesn't fail. But it is
advisable to take some care, of course.
Also, another good "platform test" would be to run the Magma test cases on
your Mac. I always ensure they run on Windows before I post to SqueakMap
so if it can't get through those, then there may be a platform-specific
bug in Squeak somewhere. They take a while to run, and it's not just a
click in TestRunner browser so let me know if you need any assistance
getting it set up.
And this also brings up a good point. The test-cases are pretty
stringent, covering lots of weird combinations and scenarios. By studying
those, you can see exactly what Magma is capable of, because it *does* get
through them all. Ok, what follows are the boring details for each set of
notes.
Magma notes 1:
Here the entire problem is that Magma does not support "relative path"
file names. There are just three or four messages in the entire API that
call for a filename and I always use fully-qualified names. I will add
improving this to my list but, in the meantime, use fully-qualified names
and all these problems should go away.
MaFilename is a facade for accessing the parts of filenames. I hope that
changing its implementation to rely more heavily on what FileDirectory
provides (you said FileStream, I presume you meant FileDirectory) will
help relative filenames work on all platforms. If I've duplicated
functionality, I probably just didn't see it. I'll research that and add
it to my list of improvements.
Magma notes 2:
In this one, the only anomaly I was able to reproduce was the apparent
lack of commit-conflict detection. I was very surprised by this at first,
but I now see why. I had my MaClientServerPreferences debug set to true,
which turns off the resignaling of the exceptions that occurred in the
server (the MagmaCommitError) and simply returns them instead (hmm, now
I'm trying to remember why I wanted that behavior..).
Please see if you, too, have yours turned on and, if so, execute this:
MaClientServerPreferences debug: false. MagmaPreferences debug: false
and then it will signal the commit error instead of just returning it.
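With the error being signaled, a retry could be wrapped around the commit along these lines. This is only a sketch under assumptions: it assumes MagmaCommitError reaches the client as an ordinary exception that on:do: can catch, and that sending abort leaves the session with a fresh view it can commit from again. The limit of three attempts and the variable names are arbitrary.

```smalltalk
"Sketch only: retry a conflicting commit a bounded number of times.
 Assumes MagmaCommitError is catchable with on:do: and that abort
 refreshes the session before the next attempt."
| attempts committed |
attempts := 0.
committed := false.
[committed or: [attempts >= 3]] whileFalse: [
	attempts := attempts + 1.
	[mySession commit: [mySession root at: 'test' put: 7].
	 committed := true]
		on: MagmaCommitError
		do: [:err |
			"Drop the stale view; the loop then tries again."
			mySession abort]].
committed ifFalse: [Transcript cr; show: 'commit failed after 3 attempts'].
```

Bounding the attempts keeps a persistently conflicting transaction from spinning forever; the change is made inside the commit: block so that it is reapplied on every attempt.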
Magma notes 3:
Basically, everything you were doing here was simulating what you want to
do for a web app. I want to be clear when I say I have never used nor
tested Magma for this purpose, but, based on how it's coded, I think it
*should* work.
Let me try in Windows.
Ok, I just tried the test where you commit the clock continuously:
[[mySession commit: [mySession root at: 'now' put: Time now]] repeat]
fork
and, in the mySession2 inspector:
[ [ self abort. Transcript cr; show: (self root at: 'now') ] repeat ]
fork
I let it run all night long. This morning, both processes were still
chugging and the server console shows #objectCount at 84302, so an average
of about three commits per second.
> Lesson learned: do not save and quit a running magma server.
No, this works too. In fact, I just killed the server (save and exit
image). However, there *was* a problem killing the clients after the
server because they automatically try to disconnect and wait for the
server response. I've already fixed that in 1.0beta8. But bringing the
server back up and then connecting with clients did work just fine.
> Ok, can connect, but can't get the root - same old strange
> silent-failure business.
There is no silent-failure business. If you can't get the root then
one of the server processes must've gotten killed, so you're going to be
flailing in a rut for anything you try going forward.
> Just out of curiosity, try to use the client session while the server
> is down. My image locks up. After a while, I manage to interrupt it,
> inside a critical section in MaTcpRequestServerLink>>submit:.
A ha! You ARE in debug mode because otherwise the timeout should have
occurred after 30 seconds. When in debug mode it is set to 2 days so I
have plenty of time to debug. You are running in debug mode.
> Open the ProcessBrowser to try to kill the process
This is a good way to get Magma in a funky state and, quite possibly,
stuck in a rut.
- Chris